Abstract

Parametric estimation for diffusion processes is considered for high frequency observations
over a fixed time interval. The processes solve stochastic differential equations with an
unknown parameter in the diffusion coefficient. We find easily verified conditions on
approximate martingale estimating functions under which estimators are consistent, rate
optimal, and efficient under high frequency (in-fill) asymptotics. The asymptotic
distributions of the estimators are shown to be normal variance-mixtures, where the mixing
distribution generally depends on the full sample path of the diffusion process over the
observation time interval. Utilising the concept of stable convergence, we also obtain the
more easily applicable result that for a suitable data dependent normalisation, the
estimators converge in distribution to a standard normal distribution. The theory is
illustrated by a simulation study comparing an efficient and a non-efficient estimating
function for an ergodic and a non-ergodic model.

Key words: Approximate martingale estimating functions, discrete time sampling of
diffusions, in-fill asymptotics, normal variance-mixtures, optimal rate, random Fisher
information, stable convergence, stochastic differential equation.

1 Introduction

Diffusions given by stochastic differential equations find application in a number of fields
where they are used to describe phenomena which evolve continuously in time. Some examples
include agronomy (Pedersen, 2000), biology (Favetto and Samson, 2010), finance (Merton, 1971;
Vasicek, 1977; Cox et al., 1985; Larsen and Sørensen, 2007) and neuroscience (Ditlevsen and
Lansky, 2006; Picchini et al., 2008; Bibbona et al., 2010).

While the models have continuous-time dynamics, data are only observable in discrete
time, thus creating a demand for statistical methods to analyse such data. With the
exception of some simple cases, the likelihood function is not explicitly known, and a large
variety of alternate estimation procedures have been proposed in the literature, see e.g.
Sørensen (2004) and Kessler et al. (2012). Parametric methods include the following.
Maximum likelihood-type estimation, primarily using Gaussian approximations to the
likelihood function, was considered by Prakasa Rao (1983), Florens-Zmirou (1989), Yoshida
(1992), Genon-Catalot and Jacod (1993), Kessler (1997), Jacod (2006), Gloter and Sørensen
(2009) and Uchida and Yoshida (2013). Analytical expansions of the transition densities
were investigated by Aït-Sahalia (2002, 2008) and Li (2013), while approximations to the
score function were studied by Bibby and Sørensen (1995), Kessler and Sørensen (1999),
Jacobsen (2001, 2002), Uchida (2004), and Sørensen (2010). Simulation-based likelihood
methods were developed by Pedersen (1995), Roberts and Stramer (2001), Durham and
Gallant (2002), Beskos et al. (2006, 2009), Golightly and Wilkinson (2006, 2008), Bladt
and Sørensen (2014), and Bladt et al. (2016).


A large part of the parametric estimators proposed in the literature can be treated within
the framework of approximate martingale estimating functions, see the review in Sørensen
(2012). In this paper, we derive easily verified conditions on such estimating functions that
imply rate optimality and efficiency under a high frequency asymptotic scenario, and thus
contribute to providing clarity and a systematic approach to this area of statistics.

Specifically, the paper concerns parametric estimation for stochastic differential equations
of the form

$$dX_t = a(X_t)\,dt + b(X_t;\theta)\,dW_t, \tag{1.1}$$

where $(W_t)_{t \ge 0}$ is a standard Wiener process. The drift and diffusion coefficients $a$
and $b$ are deterministic functions, and $\theta$ is the unknown parameter to be estimated. The
drift function $a$ need not be known, but as examples in this paper show, knowledge of $a$ can
be used in the construction of estimating functions. For ease of exposition, $X_t$ and $\theta$
are both assumed to be one-dimensional. The extension of our results to a multivariate
parameter is straightforward, and it is expected that multivariate diffusions can be treated
in a similar way. For $n \in \mathbb{N}$, we consider observations
$(X_{t_0^n}, X_{t_1^n}, \ldots, X_{t_n^n})$ in the time interval $[0,1]$, at discrete,
equidistant time-points $t_i^n = i/n$, $i = 0, 1, \ldots, n$. We investigate the high frequency
scenario where $n \to \infty$. The choice of the time-interval $[0,1]$ is not restrictive since
results generalise to other compact intervals by suitable rescaling of the drift and diffusion
coefficients. The drift coefficient does not depend on any parameter, because parameters that
appear only in the drift cannot be estimated consistently in our asymptotic scenario.


It was shown by Dohnal (1987) and Gobet (2001) that under the asymptotic scenario
considered here, the model (1.1) is locally asymptotic mixed normal with rate $\sqrt{n}$ and
random asymptotic Fisher information

$$\mathcal{I}(\theta) = 2 \int_0^1 \left( \frac{\partial_\theta b(X_s;\theta)}{b(X_s;\theta)} \right)^2 ds. \tag{1.2}$$

Thus, a consistent estimator $\hat\theta_n$ is rate optimal if $\sqrt{n}(\hat\theta_n - \theta_0)$
converges in distribution to a non-degenerate random variable as $n \to \infty$, where
$\theta_0$ is the true parameter value. The estimator is efficient if the limit may be written
in the form $\mathcal{I}(\theta_0)^{-1/2} Z$, where $Z$ is standard normal distributed and
independent of $\mathcal{I}(\theta_0)$. The concept of local asymptotic mixed normality was
introduced by Jeganathan (1982), and is discussed in e.g. Le Cam and Yang (2000, Chapter 6)
and Jacod (2010).

Estimation for the model (1.1) under the high frequency asymptotic scenario described
above was considered by Genon-Catalot and Jacod (1993, 1994). These authors proposed
estimators based on a class of contrast functions that were only allowed to depend on the
observations and the parameter through $b(X_{t_{i-1}^n};\theta)$ and
$\Delta_n^{-1/2}(X_{t_i^n} - X_{t_{i-1}^n})$. The estimators were shown to be rate optimal,
and an efficient contrast function was identified. Dohnal (1987) gave estimators for
particular cases of the model (1.1). Apart from one instance, these estimators are not of the
type investigated by Genon-Catalot and Jacod (1993, 1994), but all apart from one are covered
by the theory in the present paper.

In this paper, we investigate estimators based on the extensive class of approximate
martingale estimating functions

$$G_n(\theta) = \sum_{i=1}^n g(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n}; \theta)$$

with $\Delta_n = 1/n$, where the real-valued function $g(t,y,x;\theta)$ satisfies that
$E_\theta\big(g(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta) \mid X_{t_{i-1}^n}\big)$ is of order
$\Delta_n^\kappa$ for some $\kappa \ge 2$. Estimators are obtained as solutions to the
estimating equation $G_n(\theta) = 0$ and are referred to as $G_n$-estimators. Exact martingale
estimating functions, where $G_n(\theta)$ is a martingale, constitute a particular case that is
not covered by the theory in Genon-Catalot and Jacod (1993, 1994). An example is the maximum
likelihood estimator for the Ornstein-Uhlenbeck process with $a(x) = -x$ and
$b(x;\theta) = \sqrt{\theta}$, for which
$g(t,y,x;\theta) = (y - e^{-t}x)^2 - \tfrac{1}{2}\theta(1 - e^{-2t})$. A simpler example of an
estimating function for the same Ornstein-Uhlenbeck process that is covered by our theory, but
is not of the Genon-Catalot & Jacod-type, is given by
$g(t,y,x;\theta) = (y - (1-t)x)^2 - \theta t$.
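The order of the conditional expectation for the simpler Ornstein-Uhlenbeck example can be checked numerically. The following is a minimal sketch, assuming the Gaussian transition distribution of the process (conditional mean $e^{-\Delta}x$, conditional variance $\tfrac{1}{2}\theta(1 - e^{-2\Delta})$); the function names and the values of $x$ and $\theta$ are hypothetical and chosen only for illustration. It evaluates the conditional expectation of $g$ in closed form and divides by $\Delta^2$, which should remain bounded as $\Delta \to 0$.

```python
import math

def cond_mean_g(delta, x, theta):
    """E_theta[g(delta, X_delta, x; theta) | X_0 = x] for
    g(t, y, x; theta) = (y - (1 - t) x)^2 - theta * t, where X is the
    Ornstein-Uhlenbeck process dX = -X dt + sqrt(theta) dW."""
    mean = math.exp(-delta) * x                           # conditional mean of X_delta
    var = 0.5 * theta * (1.0 - math.exp(-2.0 * delta))    # conditional variance of X_delta
    bias = mean - (1.0 - delta) * x                       # exact mean minus linearised mean
    return var + bias ** 2 - theta * delta

def scaled_remainder(delta, x=0.7, theta=1.5):
    # E[g | x] / delta^2; illustrative values of x and theta (assumptions).
    return cond_mean_g(delta, x, theta) / delta ** 2

# Ratios for delta = 1e-1, 1e-2, 1e-3, 1e-4.
ratios = [scaled_remainder(10.0 ** (-k)) for k in range(1, 5)]
```

In this sketch the ratios settle near $-\theta$, which is consistent with the conditional expectation being of order $\Delta^2$ but not of higher order.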


The class of approximate martingale estimating functions was also studied by Sørensen
(2010), who considered high frequency observations in an increasing time interval for a
model like (1.1) where also the drift coefficient depends on a parameter. Specifically, the
observation times were $t_i^n = i\Delta_n$ with $\Delta_n \to 0$ and $n\Delta_n \to \infty$.
Simple conditions on $g$ for rate optimality and efficiency were found under the infinite
horizon high frequency asymptotics. To some extent, the methods of proof in the present paper
are similar to those in Sørensen (2010). However, while ergodicity of the diffusion process
played a central role in that paper, this property is not needed here. Another important
difference is that expansions of a higher order are needed in the present paper, which
complicates the proofs considerably. Furthermore, the theory in the current paper requires a
more complicated version of the central limit theorem for martingales, and we need the concept
of stable convergence in distribution, in order to obtain practically applicable convergence
results.

First, we establish results on existence and uniqueness of consistent $G_n$-estimators. We
show that $\sqrt{n}(\hat\theta_n - \theta_0)$ converges in distribution to a normal
variance-mixture, which implies rate optimality. The limit distribution may be represented by
the product $W(\theta_0)Z$ of independent random variables, where $Z$ is standard normal
distributed. The random variable $W(\theta_0)$ is generally non-degenerate, and depends on the
entire path of the diffusion process over the time-interval $[0,1]$. Normal variance-mixtures
were also obtained as the asymptotic distributions of the estimators of Genon-Catalot and
Jacod (1993). These distributions appear as limit distributions in comparable non-parametric
settings as well, e.g. when estimating integrated volatility (Jacod and Protter, 1998; Mykland
and Zhang, 2006) or the squared diffusion coefficient (Florens-Zmirou, 1993; Jacod, 2000).

Rate optimality is ensured by the condition that

$$\partial_y g(0,x,x;\theta) = 0 \tag{1.3}$$

for all $x$ in the state space of the diffusion process, and all parameter values $\theta$.
Here $\partial_y g(0,x,x;\theta)$ denotes the first derivative of $g(0,y,x;\theta)$ with
respect to $y$ evaluated in $y = x$. The same condition was found in Sørensen (2010) for rate
optimality of an estimator of the parameter in the diffusion coefficient, and it is one of the
conditions for small $\Delta$-optimality; see Jacobsen (2001, 2002).


Due to its dependence on $(X_s)_{s \in [0,1]}$, the limit distribution is difficult to use for
statistical applications, such as constructing confidence intervals and test statistics.
Therefore, we construct a statistic $\hat W_n$ that converges in probability to $W(\theta_0)$.
Using the stable convergence in distribution of $\sqrt{n}(\hat\theta_n - \theta_0)$ towards
$W(\theta_0)Z$, we derive the more easily applicable result that
$\sqrt{n}\,\hat W_n^{-1}(\hat\theta_n - \theta_0)$ converges in distribution to a standard
normal distribution.


The additional condition that

$$\partial_y^2 g(0,x,x;\theta) = K_\theta \frac{\partial_\theta b^2(x;\theta)}{b^4(x;\theta)} \tag{1.4}$$

($K_\theta \neq 0$) for all $x$ in the state space, and all parameter values $\theta$, ensures
efficiency of $G_n$-estimators. The same condition was obtained by Sørensen (2010) in his
infinite horizon scenario for efficiency of estimators of parameters in the diffusion
coefficient. It is also identical to a condition given by Jacobsen (2002) for small
$\Delta$-optimality. The identity of the conditions implies that examples of approximate
martingale estimating functions which are rate optimal and efficient in our asymptotic
scenario may be found in Jacobsen (2002) and Sørensen (2010). In particular, estimating
functions that are optimal in the sense of Godambe and Heyde (1987) are rate optimal and
efficient under weak regularity conditions.

The paper is structured as follows: Section 2 presents definitions, notation and terminology
used throughout the paper, as well as the main assumptions. Section 3 states and discusses
our main results, while Section 4 presents a simulation study illustrating the results.
Section 5 contains main lemmas used to prove the main theorem, and proofs of the main theorem
and the lemmas. Appendix A consists of auxiliary technical results, some of them with
proofs.


2 Preliminaries 


2.1 Model and Observations 

Let $(\Omega, \mathcal{F})$ be a measurable space supporting a real-valued random variable
$U$, and an independent standard Wiener process $W = (W_t)_{t \ge 0}$. Let
$(\mathcal{F}_t)_{t \ge 0}$ denote the filtration generated by $U$ and $W$.

Consider the stochastic differential equation

$$dX_t = a(X_t)\,dt + b(X_t;\theta)\,dW_t, \quad X_0 = U, \tag{2.1}$$

for $\theta \in \Theta \subseteq \mathbb{R}$. The state space of the solution is assumed to be
an open interval $\mathcal{X} \subseteq \mathbb{R}$, and the drift and diffusion coefficients,
$a : \mathcal{X} \to \mathbb{R}$ and $b : \mathcal{X} \times \Theta \to \mathbb{R}$, are
assumed to be known, deterministic functions. Let $(P_\theta)_{\theta \in \Theta}$ be a family
of probability measures on $(\Omega, \mathcal{F})$ such that $X = (X_t)_{t \ge 0}$ solves (2.1)
under $P_\theta$, and let $E_\theta$ denote expectation under $P_\theta$.

Let $t_i^n = i\Delta_n$ with $\Delta_n = 1/n$ for $i \in \mathbb{N}_0$, $n \in \mathbb{N}$. For
each $n \in \mathbb{N}$, $X$ is assumed to be sampled at times $t_i^n$, $i = 0, 1, \ldots, n$,
yielding the observations $(X_{t_0^n}, X_{t_1^n}, \ldots, X_{t_n^n})$. Let
$\mathcal{G}_{n,i}$ denote the $\sigma$-algebra generated by the observations
$(X_{t_0^n}, X_{t_1^n}, \ldots, X_{t_i^n})$, with $\mathcal{G}_n = \mathcal{G}_{n,n}$.


2.2 Polynomial Growth 

In the following, to avoid cumbersome notation, $C$ denotes a generic, strictly positive,
real-valued constant. Often, the notation $C_u$ is used to emphasise that the constant depends
on $u$ in some unspecified manner, where $u$ may be, e.g., a number or a set of parameter
values. Note that, for example, in an expression of the form $C_u(1 + |x|^{C_u})$, the factor
$C_u$ and the exponent $C_u$ need not be equal. Generic constants $C_u$ often depend
(implicitly) on the unknown true parameter value $\theta_0$, but never on the sample size $n$.

A function $f : [0,1] \times \mathcal{X}^2 \times \Theta \to \mathbb{R}$ is said to be of
polynomial growth in $x$ and $y$, uniformly for $t \in [0,1]$ and $\theta$ in compact, convex
sets, if for each compact, convex set $K \subseteq \Theta$ there exist constants $C_K > 0$
such that

$$\sup_{t \in [0,1],\, \theta \in K} |f(t,y,x;\theta)| \le C_K \left(1 + |x|^{C_K} + |y|^{C_K}\right)$$

for $x, y \in \mathcal{X}$.

Definition 2.1. $C^{\mathrm{pol}}_{p,q,r}([0,1] \times \mathcal{X}^2 \times \Theta)$ denotes the
class of real-valued functions $f(t,y,x;\theta)$ which satisfy that

(i) $f$ and the mixed partial derivatives
$\partial_t^i \partial_y^j \partial_\theta^k f(t,y,x;\theta)$, $i = 0, \ldots, p$,
$j = 0, \ldots, q$ and $k = 0, \ldots, r$ exist and are continuous on
$[0,1] \times \mathcal{X}^2 \times \Theta$.

(ii) $f$ and the mixed partial derivatives from (i) are of polynomial growth in $x$ and $y$,
uniformly for $t \in [0,1]$ and $\theta$ in compact, convex sets.

Similarly, the classes $C^{\mathrm{pol}}_{p,r}([0,1] \times \mathcal{X} \times \Theta)$,
$C^{\mathrm{pol}}_{q,r}(\mathcal{X}^2 \times \Theta)$,
$C^{\mathrm{pol}}_{q,r}(\mathcal{X} \times \Theta)$ and $C^{\mathrm{pol}}_{q}(\mathcal{X})$ are
defined for functions of the form $f(t,x;\theta)$, $f(y,x;\theta)$, $f(y;\theta)$ and $f(y)$,
respectively. ⋄

Note that in Definition 2.1, differentiability of $f$ with respect to $x$ is never required.

For the duration of this paper, $R(t,y,x;\theta)$ denotes a generic, real-valued function
defined on $[0,1] \times \mathcal{X}^2 \times \Theta$, which is of polynomial growth in $x$ and
$y$ uniformly for $t \in [0,1]$ and $\theta$ in compact, convex sets. The function
$R(t,y,x;\theta)$ may depend (implicitly) on $\theta_0$. Functions $R(t,x;\theta)$,
$R(y,x;\theta)$ and $R(t,x)$ are defined correspondingly. The notation $R_\lambda(t,x;\theta)$
indicates that $R(t,x;\theta)$ also depends on $\lambda \in \Theta$ in an unspecified way.


2.3 Approximate Martingale Estimating Functions 

Definition 2.2. Let $g(t,y,x;\theta)$ be a real-valued function defined on
$[0,1] \times \mathcal{X}^2 \times \Theta$. Suppose the existence of a constant
$\kappa \ge 2$, such that for all $n \in \mathbb{N}$, $i = 1, \ldots, n$, $\theta \in \Theta$,

$$E_\theta\big(g(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta) \mid X_{t_{i-1}^n}\big) = \Delta_n^\kappa R_\theta(\Delta_n, X_{t_{i-1}^n}). \tag{2.2}$$

Then, the function

$$G_n(\theta) = \sum_{i=1}^n g(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta) \tag{2.3}$$

is called an approximate martingale estimating function. In particular, when (2.2) is
satisfied with $R_\theta(t,x) = 0$, (2.3) is referred to as a martingale estimating
function. ⋄

By the Markov property of $X$, it follows that if $R_\theta(t,x) = 0$, then
$(G_{n,i})_{1 \le i \le n}$ defined by

$$G_{n,i}(\theta) = \sum_{j=1}^i g(\Delta_n, X_{t_j^n}, X_{t_{j-1}^n};\theta)$$

is a zero-mean, real-valued $(\mathcal{G}_{n,i})_{1 \le i \le n}$-martingale under $P_\theta$
for each $n \in \mathbb{N}$. The score function of the observations
$(X_{t_0^n}, X_{t_1^n}, \ldots, X_{t_n^n})$ is a martingale estimating function under weak
regularity conditions, and an approximate martingale estimating function can be viewed as
an approximation to the score function.


A $G_n$-estimator $\hat\theta_n$ is essentially obtained as a solution to the estimating
equation $G_n(\theta) = 0$. A more precise definition is given in the following
Definition 2.3. Here we make the $\omega$-dependence explicit by writing $G_n(\theta, \omega)$
and $\hat\theta_n(\omega)$.

Definition 2.3. Let $G_n(\theta, \omega)$ be an approximate martingale estimating function as
defined in Definition 2.2. Put $\Theta_\infty = \Theta \cup \{\infty\}$ and let

$$D_n = \{\omega \in \Omega \mid G_n(\theta, \omega) = 0 \text{ has at least one solution } \theta \in \Theta\}.$$

A $G_n$-estimator $\hat\theta_n(\omega)$ is any $\mathcal{G}_n$-measurable function
$\Omega \to \Theta_\infty$ which satisfies that for $P_{\theta_0}$-almost all $\omega$,
$\hat\theta_n(\omega) \in \Theta$ and $G_n(\hat\theta_n(\omega), \omega) = 0$ if
$\omega \in D_n$, while $\hat\theta_n(\omega) = \infty$ if $\omega \notin D_n$. ⋄

For any $M_n \neq 0$, the estimating functions $G_n(\theta)$ and $M_n G_n(\theta)$ yield
identical estimators of $\theta$ and are therefore referred to as versions of each other. For
any given estimating function, it is sufficient that there exists a version of the function
which satisfies the assumptions of this paper, in order to draw conclusions about the
resulting estimators. In particular, we can multiply by a function of $\Delta_n$.


2.4 Assumptions 

We make the following assumptions about the stochastic differential equation. 

Assumption 2.4. The parameter set $\Theta$ is a non-empty, open subset of $\mathbb{R}$. Under
the probability measure $P_\theta$, the continuous, $(\mathcal{F}_t)_{t \ge 0}$-adapted Markov
process $X = (X_t)_{t \ge 0}$ solves a stochastic differential equation of the form (2.1), the
coefficients of which satisfy that

a(y) € C p “' (A) and b(y\9) e (A x 0) . 

The following holds for all $\theta \in \Theta$.

(i) For all $y \in \mathcal{X}$, $b^2(y;\theta) > 0$.

(ii) There exists a real-valued constant $C_\theta > 0$ such that for all
$x, y \in \mathcal{X}$,

$$|a(x) - a(y)| + |b(x;\theta) - b(y;\theta)| \le C_\theta |x - y|.$$

(iii) $U$ has moments of any order. ⋄

The global Lipschitz condition, Assumption 2.4(ii), ensures that a unique solution $X$ exists.
The Lipschitz condition and (iii) imply that $\sup_{t \in [0,1]} E_\theta(|X_t|^m) < \infty$
for all $m \in \mathbb{N}$. Assumption 2.4 is very similar to the corresponding Condition 2.1
of Sørensen (2010). However, an important difference is that in the current paper, $X$ is not
required to be ergodic. Here, law of large numbers-type results are proved by what is, in
essence, the convergence of Riemann sums.


We make the following assumptions about the estimating function. 

Assumption 2.5. The function $g(t,y,x;\theta)$ satisfies (2.2) for some $\kappa \ge 2$, thus
defining an approximate martingale estimating function by (2.3). Moreover,

g(t,y, x; 9) e C p °% 2 ([0, 1] x A 2 x 0), 


and the following holds for all $\theta \in \Theta$.

(i) For all $x \in \mathcal{X}$, $\partial_y g(0,x,x;\theta) = 0$.

(ii) The expansion

$$g(\Delta,y,x;\theta) = g(0,y,x;\theta) + \Delta g^{(1)}(y,x;\theta) + \tfrac{1}{2}\Delta^2 g^{(2)}(y,x;\theta) + \tfrac{1}{6}\Delta^3 g^{(3)}(y,x;\theta) + \Delta^4 R(\Delta,y,x;\theta) \tag{2.4}$$

holds for all $\Delta \in [0,1]$ and $x, y \in \mathcal{X}$, where $g^{(j)}(y,x;\theta)$
denotes the $j$th partial derivative of $g(t,y,x;\theta)$ with respect to $t$, evaluated in
$t = 0$.

Assumption 2.5(i) was referred to by Sørensen (2010) as Jacobsen's condition, as it is
one of the conditions for small $\Delta$-optimality in the sense of Jacobsen (2001), see
Jacobsen (2002). The assumption ensures rate optimality of the estimators in this paper, and
of the estimators of the parameters in the diffusion coefficient in Sørensen (2010).


The assumptions of polynomial growth and existence and boundedness of all moments 
serve to simplify the exposition and proofs, and could be relaxed. 


2.5 The Infinitesimal Generator 


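For $\lambda \in \Theta$, the infinitesimal generator $\mathcal{L}_\lambda$ is defined for all
functions $f(y) \in C^{\mathrm{pol}}_{2}(\mathcal{X})$ by

$$\mathcal{L}_\lambda f(y) = a(y)\,\partial_y f(y) + \tfrac{1}{2} b^2(y;\lambda)\,\partial_y^2 f(y).$$

For $f(t,y,x;\theta) \in C^{\mathrm{pol}}_{0,2,0}([0,1] \times \mathcal{X}^2 \times \Theta)$, let

$$\mathcal{L}_\lambda f(t,y,x;\theta) = a(y)\,\partial_y f(t,y,x;\theta) + \tfrac{1}{2} b^2(y;\lambda)\,\partial_y^2 f(t,y,x;\theta). \tag{2.5}$$

Often, the notation $\mathcal{L}_\lambda f(t,y,x;\theta) = \mathcal{L}_\lambda(f(t;\theta))(y,x)$
is used, so e.g. $\mathcal{L}_\lambda(f(0;\theta))(x,x)$ means
$\mathcal{L}_\lambda f(0,y,x;\theta)$ evaluated in $y = x$. In this paper the infinitesimal
generator is particularly useful because of the following result.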


Lemma 2.6. Suppose that Assumption 2.4 holds, and that for some $k \in \mathbb{N}_0$,

$$a(y) \in C^{\mathrm{pol}}_{2k}(\mathcal{X}), \quad b(y;\theta) \in C^{\mathrm{pol}}_{2k,0}(\mathcal{X} \times \Theta) \quad \text{and} \quad f(y,x;\theta) \in C^{\mathrm{pol}}_{2(k+1),0}(\mathcal{X}^2 \times \Theta).$$

Then, for $0 \le t < t + \Delta \le 1$ and $\lambda \in \Theta$,

$$E_\lambda\big(f(X_{t+\Delta}, X_t;\theta) \mid X_t\big) = \sum_{i=0}^{k} \frac{\Delta^i}{i!} \mathcal{L}_\lambda^i f(X_t, X_t;\theta) + \int_0^\Delta \int_0^{u_1} \cdots \int_0^{u_k} E_\lambda\big(\mathcal{L}_\lambda^{k+1} f(X_{t+u_{k+1}}, X_t;\theta) \mid X_t\big)\, du_{k+1} \cdots du_1$$

where, furthermore,

$$\int_0^\Delta \int_0^{u_1} \cdots \int_0^{u_k} E_\lambda\big(\mathcal{L}_\lambda^{k+1} f(X_{t+u_{k+1}}, X_t;\theta) \mid X_t\big)\, du_{k+1} \cdots du_1 = \Delta^{k+1} R_\lambda(\Delta, X_t;\theta).$$ ⋄

The expansion of the conditional expectation in powers of $\Delta$ in the first part of the
lemma corresponds to Lemma 1 in Florens-Zmirou (1989) and Lemma 4 in Dacunha-Castelle and
Florens-Zmirou (1986). It may be proven by induction on $k$ using Itô's formula, see, e.g.,
the proof of Sørensen (2012, Lemma 1.10). The characterisation of the remainder term
follows by applying Corollary A.5; see the proof of Kessler (1997, Lemma 1).


For concrete models, Lemma 2.6 is useful for verifying the approximate martingale property
(2.2) and for creating approximate martingale estimating functions. In combination with (2.2),
the lemma is key to proving the following Lemma 2.7, which reveals two important properties of
approximate martingale estimating functions.

Lemma 2.7. Suppose that Assumptions 2.4 and 2.5 hold. Then

$$g(0,x,x;\theta) = 0 \quad \text{and} \quad g^{(1)}(x,x;\theta) = -\mathcal{L}_\theta(g(0;\theta))(x,x)$$

for all $x \in \mathcal{X}$ and $\theta \in \Theta$. ⋄

Lemma 2.7 corresponds to Lemma 2.3 of Sørensen (2010), to which we refer for details on
the proof.
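As an illustration of Lemma 2.7 (a worked check added here, not part of the cited proof), consider the simple Ornstein-Uhlenbeck estimating function from the introduction, $g(t,y,x;\theta) = (y - (1-t)x)^2 - \theta t$, with $a(y) = -y$ and $b^2(y;\theta) = \theta$. Both identities of the lemma can be read off directly:

```latex
\begin{align*}
g(0,y,x;\theta) &= (y-x)^2,
  & g(0,x,x;\theta) &= 0, \\
g^{(1)}(y,x;\theta) &= \partial_t g(t,y,x;\theta)\big|_{t=0} = 2x(y-x) - \theta,
  & g^{(1)}(x,x;\theta) &= -\theta, \\
\mathcal{L}_\theta\big(g(0;\theta)\big)(y,x) &= 2a(y)(y-x) + b^2(y;\theta) = -2y(y-x) + \theta,
  & \mathcal{L}_\theta\big(g(0;\theta)\big)(x,x) &= \theta .
\end{align*}
```

Hence $g^{(1)}(x,x;\theta) = -\mathcal{L}_\theta(g(0;\theta))(x,x)$ for this example, in agreement with the lemma.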


3 Main Results 

Section 3.1 presents the main theorem of this paper, which establishes existence, uniqueness
and asymptotic distribution results for rate optimal estimators based on approximate
martingale estimating functions. In Section 3.2 a condition is given, which ensures that the
rate optimal estimators are also efficient, and efficient estimators are discussed.

3.1 Main Theorem 

The final assumption needed for the main theorem is as follows. 

Assumption 3.1. The following holds $P_\theta$-almost surely for all $\theta \in \Theta$.

(i) For all $\lambda \neq \theta$,

$$\int_0^1 \big(b^2(X_s;\theta) - b^2(X_s;\lambda)\big)\, \partial_y^2 g(0, X_s, X_s;\lambda)\, ds \neq 0,$$

(ii)

$$\int_0^1 \partial_\theta b^2(X_s;\theta)\, \partial_y^2 g(0, X_s, X_s;\theta)\, ds \neq 0,$$

(iii)

$$\int_0^1 b^4(X_s;\theta) \big(\partial_y^2 g(0, X_s, X_s;\theta)\big)^2 ds \neq 0.$$

Assumption 3.1 can be difficult to check in practice because it involves the full sample path
of $X$ over the interval $[0,1]$. It requires, in particular, that for all
$\theta \in \Theta$, with $P_\theta$-probability one,
$t \mapsto b^2(X_t;\theta) - b^2(X_t;\lambda)$ is not Lebesgue-almost surely zero when
$\lambda \neq \theta$. As noted by Genon-Catalot and Jacod (1993), this requirement holds true
(by the continuity of the function) if, for example, $X_0 = U$ is degenerate at $x_0$, and
$b^2(x_0;\theta) \neq b^2(x_0;\lambda)$ for all $\theta \neq \lambda$.

For an efficient estimating function, Assumption 3.1 reduces to conditions on $X$ with no
further conditions on the estimating function, see the next section. Specifically, the
conditions involve only the squared diffusion coefficient $b^2(x;\theta)$ and its derivative
$\partial_\theta b^2$.


Theorem 3.2. Suppose that Assumptions 2.4, 2.5 and 3.1 hold. Then,

(i) there exists a consistent $G_n$-estimator $\hat\theta_n$. Choose any compact, convex set
$K \subseteq \Theta$ with $\theta_0 \in \operatorname{int} K$, where $\operatorname{int} K$
denotes the interior of $K$. Then, the consistent $G_n$-estimator $\hat\theta_n$ is eventually
unique in $K$, in the sense that for any $G_n$-estimator $\tilde\theta_n$ with
$P_{\theta_0}(\tilde\theta_n \in K) \to 1$ as $n \to \infty$, it holds that
$P_{\theta_0}(\hat\theta_n \neq \tilde\theta_n) \to 0$ as $n \to \infty$.

(ii) for any consistent $G_n$-estimator $\hat\theta_n$, it holds that

$$\sqrt{n}(\hat\theta_n - \theta_0) \xrightarrow{\mathcal{D}} W(\theta_0) Z. \tag{3.1}$$

The limit distribution is a normal variance-mixture, where $Z$ is standard normal
distributed, and independent of $W(\theta_0)$ given by

$$W(\theta_0) = \frac{\left( \int_0^1 \tfrac{1}{2} b^4(X_s;\theta_0) \big(\partial_y^2 g(0, X_s, X_s;\theta_0)\big)^2 ds \right)^{1/2}}{\int_0^1 \tfrac{1}{2}\, \partial_\theta b^2(X_s;\theta_0)\, \partial_y^2 g(0, X_s, X_s;\theta_0)\, ds}. \tag{3.2}$$

(iii) for any consistent $G_n$-estimator $\hat\theta_n$,

$$\hat W_n = - \frac{\left( \frac{1}{\Delta_n} \sum_{i=1}^n g^2(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\hat\theta_n) \right)^{1/2}}{\sum_{i=1}^n \partial_\theta g(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\hat\theta_n)} \tag{3.3}$$

satisfies that $\hat W_n \xrightarrow{\mathcal{P}} W(\theta_0)$, and

$$\sqrt{n}\, \hat W_n^{-1} (\hat\theta_n - \theta_0) \xrightarrow{\mathcal{D}} \mathcal{N}(0,1).$$

The proof of Theorem 3.2 is given in Section 5.1.

Dohnal (1987) and Gobet (2001) showed local asymptotic mixed normality with rate $\sqrt{n}$,
so Theorem 3.2 establishes rate optimality of $G_n$-estimators.

Observe that the limit distribution in Theorem 3.2(ii) generally depends on not only the
unknown parameter $\theta_0$, but also on the concrete realisation of the sample path
$t \mapsto X_t$ over $[0,1]$, which is only partially observed. Note also that a
variance-mixture of normal distributions can be very different from a Gaussian distribution.
It can be much more heavy-tailed and even have no moments. Theorem 3.2(iii) is therefore
important as it yields a standard normal limit distribution, which is more useful in practical
applications.
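To make the data dependent normalisation in Theorem 3.2(iii) concrete, the following minimal sketch computes the $G_n$-estimator, the statistic in (3.3) and the studentised quantity for the simple Ornstein-Uhlenbeck estimating function $g(t,y,x;\theta) = (y - (1-t)x)^2 - \theta t$ from the introduction. The function names, the sample size, the seed and the value $\theta_0 = 1.5$ are assumptions made only for this illustration; exact Gaussian transitions are used to simulate the path. For this particular $g$, the equation $G_n(\theta) = 0$ has an explicit root and $\partial_\theta g = -t$.

```python
import math
import random

def simulate_ou(n, theta, x0=0.0, rng=None):
    """Exact simulation of dX = -X dt + sqrt(theta) dW at times i/n, i = 0..n."""
    rng = rng or random.Random(0)
    delta = 1.0 / n
    decay = math.exp(-delta)
    sd = math.sqrt(0.5 * theta * (1.0 - math.exp(-2.0 * delta)))
    path = [x0]
    for _ in range(n):
        path.append(decay * path[-1] + sd * rng.gauss(0.0, 1.0))
    return path

def studentised_estimate(path, theta0):
    """Return (theta_hat, w_hat, z) for g(t, y, x; theta) = (y - (1 - t) x)^2 - theta * t,
    where w_hat follows (3.3) and z = sqrt(n) * (theta_hat - theta0) / w_hat."""
    n = len(path) - 1
    delta = 1.0 / n
    sq = [(y - (1.0 - delta) * x) ** 2 for x, y in zip(path[:-1], path[1:])]
    theta_hat = sum(sq) / (n * delta)              # unique root of G_n(theta) = 0
    sum_g2 = sum((s - theta_hat * delta) ** 2 for s in sq)
    sum_dg = -n * delta                            # d/dtheta g = -t in every summand
    w_hat = -math.sqrt(sum_g2 / delta) / sum_dg
    z = math.sqrt(n) * (theta_hat - theta0) / w_hat
    return theta_hat, w_hat, z

theta0 = 1.5  # illustrative true parameter value (assumption)
path = simulate_ou(5000, theta0, rng=random.Random(1))
theta_hat, w_hat, z = studentised_estimate(path, theta0)
```

In this example $b^2$ does not depend on $x$, so (3.2) gives the deterministic value $W(\theta_0) = \sqrt{2}\,\theta_0$, and `w_hat` can be compared with it directly. For models where $b^2$ depends on the state, $W(\theta_0)$ is random and only the studentised quantity `z` has a fixed limit distribution.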


3.2 Efficiency 

Under the assumptions of Theorem 3.2, the following additional condition ensures efficiency
of a consistent $G_n$-estimator.

Assumption 3.3. Suppose that for each $\theta \in \Theta$, there exists a constant
$K_\theta \neq 0$ such that for all $x \in \mathcal{X}$,

$$\partial_y^2 g(0,x,x;\theta) = K_\theta \frac{\partial_\theta b^2(x;\theta)}{b^4(x;\theta)}.$$ ⋄

Dohnal (1987) and Gobet (2001) showed that the local asymptotic mixed normality property
holds within the framework considered here with random Fisher information
$\mathcal{I}(\theta_0)$ given by (1.2). Thus, a $G_n$-estimator $\hat\theta_n$ is efficient if
(3.1) holds with $W(\theta_0) = \mathcal{I}(\theta_0)^{-1/2}$, and the following Corollary 3.4
may easily be verified.

Corollary 3.4. Suppose that the assumptions of Theorem 3.2 and Assumption 3.3 hold.
Then, any consistent $G_n$-estimator is also efficient. ⋄


It follows from Theorem 3.2 and Lemma 5.1 that if Assumption 3.3 holds, and if $G_n$ is
normalized such that $K_\theta = 1$, then

$$\sqrt{n}\, \hat{\mathcal{I}}_n^{1/2} (\hat\theta_n - \theta_0) \xrightarrow{\mathcal{D}} \mathcal{N}(0,1),$$


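where

$$\hat{\mathcal{I}}_n = \frac{1}{\Delta_n} \sum_{i=1}^n g^2(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\hat\theta_n).$$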

It was noted in Section 2.3 that not necessarily all versions of a particular estimating
function satisfy the conditions of this paper, even though they lead to the same estimator.
Thus, an estimating function is said to be efficient, if there exists a version which
satisfies the conditions of Corollary 3.4. The same goes for rate optimality.


Assumption 3.3 is identical to the condition for efficiency of estimators of parameters in the
diffusion coefficient in Sørensen (2010), and to one of the conditions for small
$\Delta$-optimality given in Jacobsen (2002).

Under suitable regularity conditions on the diffusion coefficient $b$, the function

$$g(t,y,x;\theta) = \frac{\partial_\theta b^2(x;\theta)}{b^4(x;\theta)} \big( (y-x)^2 - t\, b^2(x;\theta) \big) \tag{3.4}$$

yields an example of an efficient estimating function. The approximate martingale property
(2.2) can be verified by Lemma 2.6.
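The verification can be sketched as follows, assuming enough regularity that Lemma 2.6 applies with $k = 1$ to $f(y,x) = (y-x)^2$ and that $\partial_\theta b^2 / b^4$ is of polynomial growth (these conditions are assumptions of this sketch, not statements from the lemma itself). Since $f(x,x) = 0$ and $\mathcal{L}_\theta f(y,x) = 2a(y)(y-x) + b^2(y;\theta)$, the first-order term of the expansion is $\Delta\, b^2(x;\theta)$, which cancels against the compensator in (3.4):

```latex
\begin{align*}
E_\theta\big( (X_{t+\Delta} - X_t)^2 \mid X_t \big)
  &= f(X_t, X_t) + \Delta\, \mathcal{L}_\theta f(X_t, X_t) + \Delta^2 R_\theta(\Delta, X_t)
   = \Delta\, b^2(X_t;\theta) + \Delta^2 R_\theta(\Delta, X_t), \\
E_\theta\big( g(\Delta, X_{t+\Delta}, X_t;\theta) \mid X_t \big)
  &= \frac{\partial_\theta b^2(X_t;\theta)}{b^4(X_t;\theta)}
     \Big( \Delta\, b^2(X_t;\theta) + \Delta^2 R_\theta(\Delta, X_t) - \Delta\, b^2(X_t;\theta) \Big)
   = \Delta^2 R_\theta(\Delta, X_t).
\end{align*}
```

Under these assumptions, (2.2) holds with $\kappa = 2$, where $R_\theta$ denotes a generic function of polynomial growth that may change from line to line.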


When adapted to the current framework, the contrast functions investigated by Genon-Catalot
and Jacod (1993) have the form

$$U_n(\theta) = \frac{1}{n} \sum_{i=1}^n f\Big( b^2(X_{t_{i-1}^n};\theta),\; \Delta_n^{-1/2} (X_{t_i^n} - X_{t_{i-1}^n}) \Big),$$

for functions $f(v,w)$ satisfying certain conditions. For the contrast function identified as
efficient by Genon-Catalot and Jacod, $f(v,w) = \log v + w^2/v$. Using that $\Delta_n = 1/n$,
it is then seen that their efficient contrast function is of the form
$U_n(\theta) = \sum_{i=1}^n u(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta)$ with

$$u(t,y,x;\theta) = t \log b^2(x;\theta) + (y-x)^2 / b^2(x;\theta)$$

and $\partial_\theta u(t,y,x;\theta) = -g(t,y,x;\theta)$. In other words, it corresponds to a
version of the efficient approximate martingale estimating function given by (3.4). The same
contrast function was considered by Uchida and Yoshida (2013) in the framework of a more
general class of stochastic differential equations.
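The relation between the efficient contrast summand $u$ and the estimating function (3.4) can be checked numerically. The sketch below uses a hypothetical squared diffusion coefficient $b^2(x;\theta) = \theta + x^2$, chosen only for illustration, and compares a central finite difference of $u$ in $\theta$ with $-g$; all function names and the evaluation point are assumptions of this example.

```python
import math

def b2(x, theta):
    # Hypothetical squared diffusion coefficient, used only for illustration.
    return theta + x * x

def db2(x, theta):
    # Derivative of b2 with respect to theta.
    return 1.0

def g(t, y, x, theta):
    # Efficient estimating function (3.4).
    return db2(x, theta) / b2(x, theta) ** 2 * ((y - x) ** 2 - t * b2(x, theta))

def u(t, y, x, theta):
    # Summand of the efficient contrast function.
    return t * math.log(b2(x, theta)) + (y - x) ** 2 / b2(x, theta)

def du_dtheta(t, y, x, theta, eps=1e-6):
    # Central finite difference of u in theta.
    return (u(t, y, x, theta + eps) - u(t, y, x, theta - eps)) / (2.0 * eps)

t, y, x, theta = 0.01, 0.35, 0.2, 1.3   # arbitrary evaluation point
gap = du_dtheta(t, y, x, theta) + g(t, y, x, theta)   # should be close to zero
```

The quantity `gap` measures the deviation from the identity $\partial_\theta u = -g$ at the chosen point and is zero up to finite-difference error.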

A problem of considerable practical interest is how to construct estimating functions that
are rate optimal and efficient, i.e. estimating functions satisfying Assumptions 2.5(i) and
3.3. Being the same as the conditions for small $\Delta$-optimality, the assumptions are, for
example, satisfied for martingale estimating functions constructed by Jacobsen (2002).

As discussed by Sørensen (2010), the rate optimality and efficiency conditions are also
satisfied by Godambe-Heyde optimal approximate martingale estimating functions. Consider
martingale estimating functions of the form

$$g(t,y,x;\theta) = a(x,t;\theta)^* \big( f(y;\theta) - \phi_\theta^t f(x;\theta) \big),$$

where $a$ and $f$ are two-dimensional, $*$ denotes transposition, and
$\phi_\theta^t f(x;\theta) = E_\theta(f(X_t;\theta) \mid X_0 = x)$. Suppose that $f$ satisfies
appropriate (weak) conditions. Let $a$ be the weight function for which the estimating
function is optimal in the sense of Godambe and Heyde (1987), see e.g. Heyde (1997) or
Sørensen (2012, Section 1.11). It follows by an argument analogous to the proof of Theorem 4.5
in Sørensen (2010) that the estimating function with

$$g(t,y,x;\theta) = t\, a(x,t;\theta)^* \big[ f(y;\theta) - \phi_\theta^t f(x;\theta) \big]$$

satisfies Assumptions 2.5(i) and 3.3, and is thus rate optimal and efficient. As there is a
simple formula for $a$ (see Section 1.11.1 of Sørensen (2012)), this provides a way of
constructing a large number of efficient estimating functions. The result also holds if
$\phi_\theta^t f(x;\theta)$ and the conditional moments in the formula for $a$ are suitably
approximated by the help of Lemma 2.6.

Remark 3.5. Suppose for a moment that the diffusion coefficient of (2.1) has the form
$b^2(x;\theta) = h(x)k(\theta)$ for strictly positive functions $h$ and $k$, with
Assumption 2.4 satisfied. This holds true, e.g., for a number of Pearson diffusions, including
the (stationary) Ornstein-Uhlenbeck and square root processes. (See Forman and Sørensen (2008)
for more on Pearson diffusions.) Then
$\mathcal{I}(\theta_0) = \partial_\theta k(\theta_0)^2 / (2k^2(\theta_0))$. In this case,
under the assumptions of Corollary 3.4, an efficient $G_n$-estimator $\hat\theta_n$ satisfies
that $\sqrt{n}(\hat\theta_n - \theta_0) \to Y$ in distribution, where $Y$ is normal distributed
with mean zero and variance $2k^2(\theta_0) / \partial_\theta k(\theta_0)^2$, i.e. the limit
distribution is not a normal variance-mixture depending on $(X_t)_{t \in [0,1]}$. Note also
that when $b^2(x;\theta) = h(x)k(\theta)$ and Assumption 3.3 holds, then Assumption 3.1 is
satisfied when, e.g., $\partial_\theta k(\theta) > 0$ or $\partial_\theta k(\theta) < 0$. ⋄


4 Simulation study 


This section presents a simulation study illustrating the theory in the previous section. An 
efficient and an inefficient estimating function are compared for two models, an ergodic and 
a non-ergodic model. For both models the limit distributions of the consistent estimators 
are non-degenerate normal variance-mixtures. 

First, consider the stochastic differential equation

$$dX_t = -2X_t\,dt + (\theta + X_t^2)^{-1/2}\,dW_t, \quad X_0 = 0, \tag{4.1}$$

where $\theta \in (0, \infty)$ is an unknown parameter. The solution $X$ is ergodic with
invariant probability density proportional to $\exp(-2\theta x^2 - x^4)(\theta + x^2)$,
$x \in \mathbb{R}$. The process satisfies Assumption 2.4. We compare the two estimating
functions given by

$$G_n(\theta) = \sum_{i=1}^n g(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta) \quad \text{and} \quad H_n(\theta) = \sum_{i=1}^n h(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta)$$

where

$$g(t,y,x;\theta) = (y - (1-2t)x)^2 - (\theta + x^2)^{-1} t$$

$$h(t,y,x;\theta) = (\theta + x^2)^{10} (y - (1-2t)x)^2 - (\theta + x^2)^{9} t.$$

Both $g$ and $h$ satisfy Assumptions 2.5 and 3.1. Moreover, $g$ satisfies the condition for
efficiency, while $h$ is not efficient.

Let $W_G(\theta_0)$ and $W_H(\theta_0)$ be given by (3.2), that is

$$W_G(\theta_0) = -\left( \frac{1}{2} \int_0^1 \frac{1}{(\theta_0 + X_s^2)^2}\, ds \right)^{-1/2} \quad \text{and} \quad W_H(\theta_0) = -\frac{\left( \int_0^1 2(\theta_0 + X_s^2)^{18}\, ds \right)^{1/2}}{\int_0^1 (\theta_0 + X_s^2)^{8}\, ds}. \tag{4.2}$$

Numerical calculations and simulations were done in R 3.1.3 (R Core Team 2014). First,
$m = 10^4$ trajectories of the process $X$ given by (4.1) were simulated over the time-interval
$[0,1]$ with $\theta_0 = 1$. These simulations were performed using the Milstein scheme as implemented
in the R-package sde (Iacus 2014) with step size $10^{-5}$. The simulations were
subsampled to obtain sample sizes of $n = 10^3$ and $n = 10^4$. Let $\hat\theta_{G,n}$ and $\hat\theta_{H,n}$ denote
estimates of $\theta$ obtained by solving the equations $G_n(\theta) = 0$ and $H_n(\theta) = 0$ numerically, on
the interval $[0.01, 1.99]$. Using these estimates, $\widehat W_{G,n}$ and $\widehat W_{H,n}$ were calculated by (3.3).
For $n = 10^3$, $\hat\theta_{H,n}$ could not be computed for 30 of the $m = 10^4$ sample paths. For $n = 10^4$,
and for the efficient estimator, there were no problems.

Figure 1: QQ-plots comparing $Z_{G,n}$ (left) and $Z_{H,n}$ (right) to the $N(0,1)$ distribution in the
case of the ergodic model (4.1) for $n = 10^3$ (above) and $n = 10^4$ (below).
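The simulation and subsampling step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study itself used the Milstein implementation of the R-package sde, whereas the hand-written scheme, the function names `milstein_path` and `subsample`, and the coarser default step size below are choices made for this sketch. The parameter `drift_sign` is likewise hypothetical; it switches between a drift of $-2x$ and a drift of $+2x$.

```python
import math
import random


def milstein_path(theta, drift_sign=-1.0, n_steps=10_000, x0=0.0, seed=None):
    """Milstein path on [0, 1] for dX = drift_sign * 2X dt + (theta + X^2)^(-1/2) dW."""
    rng = random.Random(seed)
    dt = 1.0 / n_steps
    sqrt_dt = math.sqrt(dt)
    x = x0
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, sqrt_dt)
        s = theta + x * x
        b = s ** -0.5            # diffusion coefficient b(x; theta)
        b_db = -x / (s * s)      # b(x) * b'(x) = -x (theta + x^2)^(-2)
        # Euler step plus the Milstein correction 0.5 * b * b' * (dW^2 - dt)
        x = x + drift_sign * 2.0 * x * dt + b * dw + 0.5 * b_db * (dw * dw - dt)
        path.append(x)
    return path


def subsample(path, n):
    """Keep n + 1 equidistant points of a finer path on [0, 1]."""
    steps = len(path) - 1
    if steps % n != 0:
        raise ValueError("n must divide the number of simulation steps")
    return path[:: steps // n]
```

For example, `subsample(milstein_path(1.0, seed=1), 1000)` would give one discretely observed sample of size $n = 10^3$ with $\theta_0 = 1$.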

Figure 1 shows QQ-plots of

$$Z_{G,n} = \sqrt{n}\,\widehat W_{G,n}^{-1}(\hat\theta_{G,n} - \theta_0) \quad\text{and}\quad Z_{H,n} = \sqrt{n}\,\widehat W_{H,n}^{-1}(\hat\theta_{H,n} - \theta_0)$$

compared with a standard normal distribution, for $n = 10^3$ and $n = 10^4$ respectively. These
QQ-plots suggest that, as $n$ goes to infinity, the asymptotic distribution in Theorem 3.2(iii)
becomes a good approximation faster in the efficient case than in the inefficient case.

Inserting $\theta_0 = 1$ into (4.2), the integrals in these expressions may be approximated by
Riemann sums, using each of the simulated trajectories of $X$ (with sample size $n = 10^4$
for maximal accuracy). This method yields a second set of approximations $\widetilde W_G$ and $\widetilde W_H$ to
the realisations of the random variables $W_G(\theta_0)$ and $W_H(\theta_0)$, presumed to be more accurate
than $\widehat W_{G,10^4}$ and $\widehat W_{H,10^4}$ as they utilise the true parameter value. The density function in R
was used (with default arguments) to compute an approximation to the densities of $W_G(\theta_0)$
and $W_H(\theta_0)$, using the approximate realisations $\widetilde W_G$ and $\widetilde W_H$.

Figure 2: Approximation to the densities of $W_G(\theta_0)$ (left) and $W_H(\theta_0)$ (right) based on $\widetilde W_G$
and $\widetilde W_H$ in the case of the ergodic model (4.1).
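The Riemann-sum approximation described above can be sketched as follows. The sketch assumes the representation $W(\theta_0) = -B(\theta_0;\theta_0)^{-1}C(\theta_0;\theta_0)^{1/2}$ used in the proof of Theorem 3.2, with the integrals $B$ and $C$ worked out by hand for the functions $g$ and $h$ of model (4.1); the function names and the use of left end points are choices made for this illustration only.

```python
import math


def _left_riemann(values):
    """Left Riemann sum over [0, 1] for n = len(values) equidistant points."""
    return sum(values) / len(values)


def w_g(path, theta0):
    """Approximate W_G(theta0) from a path (X_{t_0}, ..., X_{t_n}) on [0, 1]."""
    s = [theta0 + x * x for x in path[:-1]]
    b_int = _left_riemann([v ** -2 for v in s])        # B = int (theta0 + X^2)^(-2) ds
    c_int = 2.0 * _left_riemann([v ** -2 for v in s])  # C = 2 int (theta0 + X^2)^(-2) ds
    return -math.sqrt(c_int) / b_int


def w_h(path, theta0):
    """Approximate W_H(theta0) from a path (X_{t_0}, ..., X_{t_n}) on [0, 1]."""
    s = [theta0 + x * x for x in path[:-1]]
    b_int = _left_riemann([v ** 8 for v in s])         # B = int (theta0 + X^2)^8 ds
    c_int = 2.0 * _left_riemann([v ** 18 for v in s])  # C = 2 int (theta0 + X^2)^18 ds
    return -math.sqrt(c_int) / b_int
```

Applying `w_g` and `w_h` to each simulated trajectory would give the samples from which the densities are then estimated.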

It is seen from Figure 2 that the distribution of $W_H(\theta_0)$ is much more spread out than the
distribution of $W_G(\theta_0)$. This corresponds well to the limit distribution in Theorem 3.2(ii)
being more spread out in the inefficient case than in the efficient case. Along the same lines,
Figure 3 shows similarly computed densities based on $\sqrt{n}(\hat\theta_{G,n} - \theta_0)$ and $\sqrt{n}(\hat\theta_{H,n} - \theta_0)$ for
$n = 10^4$, which may be considered approximations to the densities of the normal variance-mixture
limit distributions in Theorem 3.2(ii). These plots also illustrate that the limit
distribution of the inefficient estimator is more spread out than that of the efficient estimator.


Figure 3: Estimated densities of $\sqrt{n}(\hat\theta_{G,n} - \theta_0)$ (solid curve) and $\sqrt{n}(\hat\theta_{H,n} - \theta_0)$ (dashed
curve) for $n = 10^4$ in the case of the ergodic model (4.1).

Now, consider the stochastic differential equation

$$dX_t = 2X_t\,dt + (\theta + X_t^2)^{-1/2}\,dW_t, \qquad X_0 = 0. \tag{4.3}$$

For this model, the solution $X$ is not ergodic, but again Assumption 2.4 holds. We compare
the two estimating functions given by

$$\begin{aligned}
g(t,y,x;\theta) &= (y - (1+2t)x)^2 - (\theta + x^2)^{-1}\,t \\
h(t,y,x;\theta) &= (\theta + x^2)^{10}(y - (1+2t)x)^2 - (\theta + x^2)^{9}\,t.
\end{aligned}$$

For both $g$ and $h$, Assumptions 2.5 and 3.1 hold, and $g$ is efficient, while $h$ is not.


Simulations were carried out in the same manner as for the ergodic model. In the non-ergodic
case, an estimator was again found for every sample path when the efficient estimating
function given by $g$ was used. For the inefficient estimating function given by
$h$, there was no solution to the estimating equation (in $[0.01, 1.99]$) in 14% of the samples
for $n = 10^4$ and in 39% of the samples for $n = 10^3$. Figure 4 shows QQ-plots of
$Z_{G,n} = \sqrt{n}\,\widehat W_{G,n}^{-1}(\hat\theta_{G,n} - \theta_0)$ and $Z_{H,n} = \sqrt{n}\,\widehat W_{H,n}^{-1}(\hat\theta_{H,n} - \theta_0)$ compared with a standard normal
distribution, for $n = 10^3$ and $n = 10^4$ respectively. These QQ-plots indicate that in the
non-ergodic case there is a slightly slower convergence to the asymptotic distribution in
Theorem 3.2(iii) for the efficient estimating function, and a considerably slower convergence
for the inefficient estimating function, when compared to the ergodic case.

Figure 4: QQ-plots comparing $Z_{G,n}$ (left) and $Z_{H,n}$ (right) to the $N(0,1)$ distribution in the
case of the non-ergodic model (4.3) for $n = 10^3$ (above) and $n = 10^4$ (below).


5 Proofs


Section 5.1 states three main lemmas needed to prove Theorem 3.2, followed by the proof
of the theorem. Section 5.2 contains the proofs of the three lemmas.


5.1 Proof of the Main Theorem 


In order to prove Theorem 3.2, we use the following lemmas, together with results from
Jacod and Sørensen (2012), and Sørensen (2012, Section 1.10).


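Lemma 5.1. Suppose that Assumptions 2.4 and 2.5 hold. For $\theta \in \Theta$, let

$$\begin{aligned}
G_n(\theta) &= \sum_{i=1}^{n} g(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n}; \theta) \\
G_n^{sq}(\theta) &= \frac{1}{\Delta_n}\sum_{i=1}^{n} g^2(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n}; \theta)
\end{aligned}$$

and

$$\begin{aligned}
A(\theta;\theta_0) &= \frac{1}{2}\int_0^1 \big(b^2(X_s;\theta_0) - b^2(X_s;\theta)\big)\,\partial_y^2 g(0,X_s,X_s;\theta)\,ds \\
B(\theta;\theta_0) &= \frac{1}{2}\int_0^1 \big(b^2(X_s;\theta_0) - b^2(X_s;\theta)\big)\,\partial_y^2\partial_\theta g(0,X_s,X_s;\theta)\,ds \\
&\quad - \frac{1}{2}\int_0^1 \partial_\theta b^2(X_s;\theta)\,\partial_y^2 g(0,X_s,X_s;\theta)\,ds \\
C(\theta;\theta_0) &= \frac{1}{2}\int_0^1 \Big(b^4(X_s;\theta_0) + \frac{1}{2}\big(b^2(X_s;\theta_0) - b^2(X_s;\theta)\big)^2\Big)\big(\partial_y^2 g(0,X_s,X_s;\theta)\big)^2\,ds.
\end{aligned}$$

Then,

(i) the mappings $\theta \mapsto A(\theta;\theta_0)$, $\theta \mapsto B(\theta;\theta_0)$ and $\theta \mapsto C(\theta;\theta_0)$ are continuous on $\Theta$
($\mathbb{P}_{\theta_0}$-almost surely) with $A(\theta_0;\theta_0) = 0$ and $\partial_\theta A(\theta;\theta_0) = B(\theta;\theta_0)$.

(ii) for all $t \in [0,1]$,

$$\frac{1}{\sqrt{\Delta_n}}\sum_{i=1}^{[nt]} \Big|\mathbb{E}_{\theta_0}\big(g(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta_0) \mid X_{t_{i-1}^n}\big)\Big| \overset{\mathcal{P}}{\longrightarrow} 0 \tag{5.1}$$

$$\frac{1}{\Delta_n}\sum_{i=1}^{[nt]} \mathbb{E}_{\theta_0}\big(g(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta_0) \mid X_{t_{i-1}^n}\big)^2 \overset{\mathcal{P}}{\longrightarrow} 0 \tag{5.2}$$

$$\frac{1}{\Delta_n^2}\sum_{i=1}^{[nt]} \mathbb{E}_{\theta_0}\big(g^4(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta_0) \mid X_{t_{i-1}^n}\big) \overset{\mathcal{P}}{\longrightarrow} 0 \tag{5.3}$$

and

$$\frac{1}{\Delta_n}\sum_{i=1}^{[nt]} \mathbb{E}_{\theta_0}\big(g^2(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta_0) \mid X_{t_{i-1}^n}\big) \overset{\mathcal{P}}{\longrightarrow} \frac{1}{2}\int_0^t b^4(X_s;\theta_0)\big(\partial_y^2 g(0,X_s,X_s;\theta_0)\big)^2\,ds. \tag{5.4}$$

(iii) for all compact, convex subsets $K \subseteq \Theta$,

$$\begin{aligned}
\sup_{\theta\in K} \big|G_n(\theta) - A(\theta;\theta_0)\big| &\overset{\mathcal{P}}{\longrightarrow} 0 \\
\sup_{\theta\in K} \big|\partial_\theta G_n(\theta) - B(\theta;\theta_0)\big| &\overset{\mathcal{P}}{\longrightarrow} 0 \\
\sup_{\theta\in K} \big|G_n^{sq}(\theta) - C(\theta;\theta_0)\big| &\overset{\mathcal{P}}{\longrightarrow} 0.
\end{aligned}$$

∘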


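Lemma 5.2. Suppose that Assumptions 2.4 and 2.5 hold. Then, for all $t \in [0,1]$,

$$\frac{1}{\sqrt{\Delta_n}}\sum_{i=1}^{[nt]} \mathbb{E}_{\theta_0}\big(g(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta_0)(W_{t_i^n} - W_{t_{i-1}^n}) \mid \mathcal{F}_{t_{i-1}^n}\big) \overset{\mathcal{P}}{\longrightarrow} 0. \tag{5.5}$$


Lemma 5.3. Suppose that Assumptions 2.4 and 2.5 hold, and let

$$Y_{n,t} = \frac{1}{\sqrt{\Delta_n}}\sum_{i=1}^{[nt]} g(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta_0).$$

Then the sequence of processes $(Y_n)_{n\in\mathbb{N}}$ given by $Y_n = (Y_{n,t})_{t\in[0,1]}$ converges stably in
distribution under $\mathbb{P}_{\theta_0}$ to the process $Y = (Y_t)_{t\in[0,1]}$ given by

$$Y_t = \frac{1}{\sqrt{2}}\int_0^t b^2(X_s;\theta_0)\,\partial_y^2 g(0,X_s,X_s;\theta_0)\,dB_s.$$

Here $B = (B_s)_{s\geq 0}$ denotes a standard Wiener process, which is defined on a filtered extension
$(\Omega', \mathcal{F}', (\mathcal{F}'_t)_{t\geq 0}, \mathbb{P}'_{\theta_0})$ of $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\geq 0}, \mathbb{P}_{\theta_0})$, and is independent of $(U, W)$. ∘

We denote stable convergence in distribution under $\mathbb{P}_{\theta_0}$ as $n \to \infty$ by $\overset{\mathcal{D}_{st}}{\longrightarrow}$.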


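Proof of Theorem 3.2. Let a compact, convex subset $K \subseteq \Theta$ with $\theta_0 \in \operatorname{int} K$ be given. The
functions $G_n(\theta)$, $A(\theta,\theta_0)$, $B(\theta,\theta_0)$, and $C(\theta,\theta_0)$ were defined in Lemma 5.1.

By Lemma 5.1(i) and (iii),

$$G_n(\theta_0) \overset{\mathcal{P}}{\longrightarrow} 0 \quad\text{and}\quad \sup_{\theta\in K}\big|\partial_\theta G_n(\theta) - B(\theta,\theta_0)\big| \overset{\mathcal{P}}{\longrightarrow} 0 \tag{5.6}$$

with $B(\theta_0;\theta_0) \neq 0$ by Assumption 3.1(ii), so $G_n(\theta)$ satisfies the conditions of Theorem 1.58
in Sørensen (2012).

Now, we show (1.161) of Theorem 1.59 in Sørensen (2012). Let $\varepsilon > 0$ be given, and
let $\bar B_\varepsilon(\theta_0)$ and $B_\varepsilon(\theta_0)$, respectively, denote closed and open balls in $\mathbb{R}$ with radius $\varepsilon > 0$,
centered at $\theta_0$. The compact set $K \setminus B_\varepsilon(\theta_0)$ does not contain $\theta_0$, and so, by Assumption
3.1(i), $A(\theta,\theta_0) \neq 0$ for all $\theta \in K \setminus B_\varepsilon(\theta_0)$ with probability one under $\mathbb{P}_{\theta_0}$.

Because

$$\inf_{\theta\in K\setminus \bar B_\varepsilon(\theta_0)} |A(\theta,\theta_0)| \;\geq\; \inf_{\theta\in K\setminus B_\varepsilon(\theta_0)} |A(\theta,\theta_0)| \;>\; 0$$

$\mathbb{P}_{\theta_0}$-almost surely, by the continuity of $\theta \mapsto A(\theta,\theta_0)$, it follows that

$$\mathbb{P}_{\theta_0}\Big(\inf_{\theta\in K\setminus \bar B_\varepsilon(\theta_0)} |A(\theta,\theta_0)| > 0\Big) = 1.$$

Consequently, by Theorem 1.59 in Sørensen (2012), for any $G_n$-estimator $\tilde\theta_n$,

$$\mathbb{P}_{\theta_0}\big(\tilde\theta_n \in K\setminus \bar B_\varepsilon(\theta_0)\big) \to 0 \quad\text{as } n \to \infty \tag{5.7}$$

for any $\varepsilon > 0$.

By Theorem 1.58 in Sørensen (2012), there exists a consistent $G_n$-estimator $\hat\theta_n$, which is
eventually unique, in the sense that if $\bar\theta_n$ is another consistent $G_n$-estimator, then

$$\mathbb{P}_{\theta_0}\big(\hat\theta_n \neq \bar\theta_n\big) \to 0 \quad\text{as } n \to \infty. \tag{5.8}$$

Suppose that $\tilde\theta_n$ is any $G_n$-estimator which satisfies that

$$\mathbb{P}_{\theta_0}\big(\tilde\theta_n \in K\big) \to 1 \quad\text{as } n \to \infty. \tag{5.9}$$

Combining (5.7) and (5.9), it follows that

$$\mathbb{P}_{\theta_0}\big(\tilde\theta_n \in \bar B_\varepsilon(\theta_0)\big) \to 1 \quad\text{as } n \to \infty, \tag{5.10}$$

so $\tilde\theta_n$ is consistent. Using (5.8), Theorem 3.2(i) follows.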

To prove Theorem 3.2(ii), recall that $\Delta_n = 1/n$, and observe that by Lemma 5.3,

$$\sqrt{n}\,G_n(\theta_0) \overset{\mathcal{D}_{st}}{\longrightarrow} S(\theta_0), \tag{5.11}$$

where

$$S(\theta_0) = \int_0^1 \frac{1}{\sqrt{2}}\,b^2(X_s;\theta_0)\,\partial_y^2 g(0,X_s,X_s;\theta_0)\,dB_s,$$

and $B = (B_s)_{s\in[0,1]}$ is a standard Wiener process, independent of $(U, W)$. As $X$ is then also
independent of $B$, $S(\theta_0)$ is equal in distribution to $C(\theta_0;\theta_0)^{1/2} Z$, where $Z$ is standard normal
distributed and independent of $(X_t)_{t\in[0,1]}$. Note that by Assumption 3.1(iii), the distribution
of $C(\theta_0;\theta_0)^{1/2} Z$ is non-degenerate.

Let $\hat\theta_n$ be a consistent $G_n$-estimator. By (5.6), (5.11) and properties of stable convergence
(e.g. (2.3) in Jacod (1997)),

$$\begin{pmatrix} \sqrt{n}\,G_n(\theta_0) \\ \partial_\theta G_n(\theta_0) \end{pmatrix} \overset{\mathcal{D}_{st}}{\longrightarrow} \begin{pmatrix} S(\theta_0) \\ B(\theta_0;\theta_0) \end{pmatrix}.$$

Stable convergence in distribution implies weak convergence, so an application of Theorem
1.60 in Sørensen (2012) yields

$$\sqrt{n}(\hat\theta_n - \theta_0) \overset{\mathcal{D}}{\longrightarrow} -B(\theta_0,\theta_0)^{-1} S(\theta_0). \tag{5.12}$$

The limit is equal in distribution to $W(\theta_0)Z$, where $W(\theta_0) = -B(\theta_0,\theta_0)^{-1} C(\theta_0,\theta_0)^{1/2}$ and
$Z$ is standard normal distributed and independent of $W(\theta_0)$. This completes the proof of
Theorem 3.2(ii).

Finally, Lemma 2.14 in Jacod and Sørensen (2012) is used to write

$$\sqrt{n}(\hat\theta_n - \theta_0) = -B(\theta_0;\theta_0)^{-1}\sqrt{n}\,G_n(\theta_0) + \sqrt{n}\,|\hat\theta_n - \theta_0|\,\varepsilon_n(\theta_0),$$

where the last term goes to zero in probability under $\mathbb{P}_{\theta_0}$. By the stable continuous mapping
theorem, (5.12) holds with stable convergence in distribution as well. Lemma 5.1(iii) may
be used to conclude that $\widehat W_n \overset{\mathcal{P}}{\longrightarrow} W(\theta_0)$, so Theorem 3.2(iii) follows from the stable version
of (5.12) by application of standard results for stable convergence. □
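The difference between the normal variance-mixture limit in Theorem 3.2(ii) and the standard normal limit in Theorem 3.2(iii) can be illustrated with a toy Monte Carlo experiment. In the sketch below the mixing variable is drawn from a uniform distribution on $[0.5, 2]$; this distribution, the function names and the use of excess kurtosis as a summary are assumptions made purely for illustration and are not taken from the models of Section 4.

```python
import random
import statistics


def excess_kurtosis(xs):
    """Sample excess kurtosis; close to 0 for normally distributed data."""
    m = statistics.fmean(xs)
    v = statistics.fmean([(x - m) ** 2 for x in xs])
    k = statistics.fmean([(x - m) ** 4 for x in xs])
    return k / (v * v) - 3.0


def mixture_demo(n_draws=100_000, seed=0):
    """Return the excess kurtosis of W*Z and of W*Z divided by W."""
    rng = random.Random(seed)
    mixed, normalised = [], []
    for _ in range(n_draws):
        w = rng.uniform(0.5, 2.0)   # hypothetical mixing variable, independent of z
        z = rng.gauss(0.0, 1.0)     # standard normal variable
        mixed.append(w * z)         # analogue of the limit W(theta_0) Z
        normalised.append(w * z / w)  # analogue of the random normalisation by W
    return excess_kurtosis(mixed), excess_kurtosis(normalised)
```

The first returned value should be clearly positive (heavier tails than a normal distribution), while the second should be close to zero, which is why the data-dependent normalisation makes the result directly usable for inference.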


5.2 Proofs of Main Lemmas 


This section contains the proofs of Lemmas 5.1, 5.2 and 5.3 in Section 5.1. A number of
technical results are utilised in the proofs; these results are summarised in Appendix A,
some of them with a proof.

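Proof of Lemma 5.1. First, note that for any $f(x;\theta) \in C^{pol}_{0,0}(\mathcal{X}\times\Theta)$ and any compact, convex
subset $K \subseteq \Theta$, there exist constants $C_K > 0$ such that

$$|f(X_s;\theta)| \leq C_K\big(1 + |X_s|^{C_K}\big)$$

for all $s \in [0,1]$ and $\theta \in \operatorname{int} K$. With probability one under $\mathbb{P}_{\theta_0}$, for fixed $\omega$, $C_K(1 + |X_s(\omega)|^{C_K})$
is a continuous function and therefore Lebesgue-integrable over $[0,1]$. Using this method
of constructing integrable upper bounds, Lemma 5.1(i) follows by the usual results for
continuity and differentiability of functions given by integrals.

In the rest of this proof, Lemma A.3 and (A.7) are repeatedly used without reference.

First, inserting $\theta = \theta_0$ into (A.1), it is seen that

$$\begin{aligned}
\frac{1}{\sqrt{\Delta_n}}\sum_{i=1}^{[nt]} \Big|\mathbb{E}_{\theta_0}\big(g(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta_0) \mid X_{t_{i-1}^n}\big)\Big| &= \Delta_n^{3/2}\sum_{i=1}^{[nt]} \big|R(\Delta_n, X_{t_{i-1}^n};\theta_0)\big| \overset{\mathcal{P}}{\longrightarrow} 0 \\
\frac{1}{\Delta_n}\sum_{i=1}^{[nt]} \mathbb{E}_{\theta_0}\big(g(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta_0) \mid X_{t_{i-1}^n}\big)^2 &= \Delta_n^{3}\sum_{i=1}^{[nt]} R(\Delta_n, X_{t_{i-1}^n};\theta_0) \overset{\mathcal{P}}{\longrightarrow} 0,
\end{aligned}$$

proving (5.1) and (5.2). Furthermore, using (A.1) and (A.3),

$$\begin{aligned}
\sum_{i=1}^{n} \mathbb{E}_{\theta_0}\big(g(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta) \mid X_{t_{i-1}^n}\big) &\overset{\mathcal{P}}{\longrightarrow} A(\theta;\theta_0) \\
\sum_{i=1}^{n} \mathbb{E}_{\theta_0}\big(g^2(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta) \mid X_{t_{i-1}^n}\big) &\overset{\mathcal{P}}{\longrightarrow} 0,
\end{aligned}$$

so it follows from Lemma A.1 that point-wise for $\theta \in \Theta$,

$$G_n(\theta) - A(\theta;\theta_0) \overset{\mathcal{P}}{\longrightarrow} 0. \tag{5.13}$$

Using (A.3) and (A.5),

$$\frac{1}{\Delta_n}\sum_{i=1}^{[nt]} \mathbb{E}_{\theta_0}\big(g^2(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta) \mid X_{t_{i-1}^n}\big) \overset{\mathcal{P}}{\longrightarrow} \frac{1}{2}\int_0^t \Big(b^4(X_s;\theta_0) + \frac{1}{2}\big(b^2(X_s;\theta_0) - b^2(X_s;\theta)\big)^2\Big)\big(\partial_y^2 g(0,X_s,X_s;\theta)\big)^2\,ds$$

and

$$\frac{1}{\Delta_n^2}\sum_{i=1}^{[nt]} \mathbb{E}_{\theta_0}\big(g^4(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta) \mid X_{t_{i-1}^n}\big) \overset{\mathcal{P}}{\longrightarrow} 0,$$

completing the proof of Lemma 5.1(ii) when $\theta = \theta_0$ is inserted, and yielding

$$G_n^{sq}(\theta) - C(\theta;\theta_0) \overset{\mathcal{P}}{\longrightarrow} 0 \tag{5.14}$$

point-wise for $\theta \in \Theta$ by Lemma A.1, when $t = 1$ is inserted. Also, using (A.2) and (A.4),

$$\begin{aligned}
\sum_{i=1}^{n} \mathbb{E}_{\theta_0}\big(\partial_\theta g(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta) \mid X_{t_{i-1}^n}\big) &\overset{\mathcal{P}}{\longrightarrow} B(\theta;\theta_0) \\
\sum_{i=1}^{n} \mathbb{E}_{\theta_0}\big((\partial_\theta g(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta))^2 \mid X_{t_{i-1}^n}\big) &\overset{\mathcal{P}}{\longrightarrow} 0.
\end{aligned}$$

Thus, by Lemma A.1 also

$$\partial_\theta G_n(\theta) - B(\theta;\theta_0) \overset{\mathcal{P}}{\longrightarrow} 0, \tag{5.15}$$

point-wise for $\theta \in \Theta$. Finally, recall that $\partial_y^j g(0,x,x;\theta) = 0$ for $j = 0,1$. Then, using
Lemmas A.7 and A.8, it follows that for each $m \in \mathbb{N}$ and compact, convex subset $K \subseteq \Theta$,
there exist constants $C_{m,K} > 0$ such that for all $\theta,\theta' \in K$ and $n \in \mathbb{N}$,

$$\begin{aligned}
\mathbb{E}_{\theta_0}\big|(G_n(\theta) - A(\theta;\theta_0)) - (G_n(\theta') - A(\theta';\theta_0))\big|^{2m} &\leq C_{m,K}|\theta - \theta'|^{2m} \\
\mathbb{E}_{\theta_0}\big|(\partial_\theta G_n(\theta) - B(\theta;\theta_0)) - (\partial_\theta G_n(\theta') - B(\theta';\theta_0))\big|^{2m} &\leq C_{m,K}|\theta - \theta'|^{2m} \\
\mathbb{E}_{\theta_0}\big|(G_n^{sq}(\theta) - C(\theta;\theta_0)) - (G_n^{sq}(\theta') - C(\theta';\theta_0))\big|^{2m} &\leq C_{m,K}|\theta - \theta'|^{2m}.
\end{aligned} \tag{5.16}$$

By Lemma 5.1(i), the functions $\theta \mapsto G_n(\theta) - A(\theta;\theta_0)$, $\theta \mapsto \partial_\theta G_n(\theta) - B(\theta;\theta_0)$ and $\theta \mapsto G_n^{sq}(\theta) - C(\theta;\theta_0)$
are continuous on $\Theta$. Thus, using Lemma A.9 together with (5.13), (5.14),
(5.15) and (5.16) completes the proof of Lemma 5.1(iii). □

Proof of Lemma 5.2. The overall strategy in this proof is to expand the expression on the
left-hand side of (5.5) in such a manner that all terms either converge to 0 by Lemma A.3
or are equal to 0 by the martingale properties of stochastic integral terms obtained by use
of Itô's formula.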

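By Assumption 2.5 and Lemma 2.7, the formulae

$$\begin{aligned}
g(0,y,x;\theta) &= \tfrac{1}{2}(y-x)^2\,\partial_y^2 g(0,x,x;\theta) + (y-x)^3 R(y,x;\theta) \\
g^{(1)}(y,x;\theta) &= g^{(1)}(x,x;\theta) + (y-x)R(y,x;\theta)
\end{aligned} \tag{5.17}$$

may be obtained. Using (2.4) and (5.17),

$$\begin{aligned}
&\mathbb{E}_{\theta_0}\big(g(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta_0)(W_{t_i^n} - W_{t_{i-1}^n}) \mid \mathcal{F}_{t_{i-1}^n}\big) \\
&\quad= \mathbb{E}_{\theta_0}\big(\tfrac{1}{2}(X_{t_i^n} - X_{t_{i-1}^n})^2\,\partial_y^2 g(0, X_{t_{i-1}^n}, X_{t_{i-1}^n};\theta_0)(W_{t_i^n} - W_{t_{i-1}^n}) \mid \mathcal{F}_{t_{i-1}^n}\big) \\
&\qquad+ \mathbb{E}_{\theta_0}\big((X_{t_i^n} - X_{t_{i-1}^n})^3 R(X_{t_i^n}, X_{t_{i-1}^n};\theta_0)(W_{t_i^n} - W_{t_{i-1}^n}) \mid \mathcal{F}_{t_{i-1}^n}\big) \\
&\qquad+ \Delta_n\,\mathbb{E}_{\theta_0}\big(g^{(1)}(X_{t_{i-1}^n}, X_{t_{i-1}^n};\theta_0)(W_{t_i^n} - W_{t_{i-1}^n}) \mid \mathcal{F}_{t_{i-1}^n}\big) \\
&\qquad+ \Delta_n\,\mathbb{E}_{\theta_0}\big((X_{t_i^n} - X_{t_{i-1}^n}) R(X_{t_i^n}, X_{t_{i-1}^n};\theta_0)(W_{t_i^n} - W_{t_{i-1}^n}) \mid \mathcal{F}_{t_{i-1}^n}\big) \\
&\qquad+ \Delta_n^2\,\mathbb{E}_{\theta_0}\big(R(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta_0)(W_{t_i^n} - W_{t_{i-1}^n}) \mid \mathcal{F}_{t_{i-1}^n}\big).
\end{aligned} \tag{5.18}$$

Note that

$$\Delta_n\, g^{(1)}(X_{t_{i-1}^n}, X_{t_{i-1}^n};\theta_0)\,\mathbb{E}_{\theta_0}\big(W_{t_i^n} - W_{t_{i-1}^n} \mid \mathcal{F}_{t_{i-1}^n}\big) = 0,$$

and that by repeated use of the Cauchy-Schwarz inequality, Lemma A.4 and Corollary A.5,

$$\begin{aligned}
\Big|\mathbb{E}_{\theta_0}\big((X_{t_i^n} - X_{t_{i-1}^n})^3 R(X_{t_i^n}, X_{t_{i-1}^n};\theta_0)(W_{t_i^n} - W_{t_{i-1}^n}) \mid \mathcal{F}_{t_{i-1}^n}\big)\Big| &\leq \Delta_n^2\, C\big(1 + |X_{t_{i-1}^n}|^C\big) \\
\Delta_n \Big|\mathbb{E}_{\theta_0}\big((X_{t_i^n} - X_{t_{i-1}^n}) R(X_{t_i^n}, X_{t_{i-1}^n};\theta_0)(W_{t_i^n} - W_{t_{i-1}^n}) \mid \mathcal{F}_{t_{i-1}^n}\big)\Big| &\leq \Delta_n^2\, C\big(1 + |X_{t_{i-1}^n}|^C\big) \\
\Delta_n^2 \Big|\mathbb{E}_{\theta_0}\big(R(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta_0)(W_{t_i^n} - W_{t_{i-1}^n}) \mid \mathcal{F}_{t_{i-1}^n}\big)\Big| &\leq \Delta_n^{5/2}\, C\big(1 + |X_{t_{i-1}^n}|^C\big)
\end{aligned}$$

for suitable constants $C > 0$, with

$$\frac{1}{\sqrt{\Delta_n}}\sum_{i=1}^{[nt]} \Delta_n^{m/2}\, C\big(1 + |X_{t_{i-1}^n}|^C\big) \overset{\mathcal{P}}{\longrightarrow} 0$$

for $m = 4, 5$ by Lemma A.3. Now, by (5.18), it only remains to show that

$$\frac{1}{\sqrt{\Delta_n}}\sum_{i=1}^{[nt]} \partial_y^2 g(0, X_{t_{i-1}^n}, X_{t_{i-1}^n};\theta_0)\,\mathbb{E}_{\theta_0}\big((X_{t_i^n} - X_{t_{i-1}^n})^2 (W_{t_i^n} - W_{t_{i-1}^n}) \mid \mathcal{F}_{t_{i-1}^n}\big) \overset{\mathcal{P}}{\longrightarrow} 0. \tag{5.19}$$

Applying Itô's formula with the function

$$f(y,w) = (y - x_{t_{i-1}^n})^2 (w - w_{t_{i-1}^n})$$

to the process $(X_t, W_t)_{t \geq t_{i-1}^n}$, conditioned on $(X_{t_{i-1}^n}, W_{t_{i-1}^n}) = (x_{t_{i-1}^n}, w_{t_{i-1}^n})$, it follows that

$$\begin{aligned}
(X_{t_i^n} - X_{t_{i-1}^n})^2 (W_{t_i^n} - W_{t_{i-1}^n})
&= 2\int_{t_{i-1}^n}^{t_i^n} (X_s - X_{t_{i-1}^n})(W_s - W_{t_{i-1}^n})\,a(X_s)\,ds + \int_{t_{i-1}^n}^{t_i^n} (W_s - W_{t_{i-1}^n})\,b^2(X_s;\theta_0)\,ds \\
&\quad+ 2\int_{t_{i-1}^n}^{t_i^n} (X_s - X_{t_{i-1}^n})\,b(X_s;\theta_0)\,ds + 2\int_{t_{i-1}^n}^{t_i^n} (X_s - X_{t_{i-1}^n})(W_s - W_{t_{i-1}^n})\,b(X_s;\theta_0)\,dW_s \\
&\quad+ \int_{t_{i-1}^n}^{t_i^n} (X_s - X_{t_{i-1}^n})^2\,dW_s.
\end{aligned} \tag{5.20}$$

By the martingale property of the Itô integrals in (5.20),

$$\begin{aligned}
\mathbb{E}_{\theta_0}\big((X_{t_i^n} - X_{t_{i-1}^n})^2 (W_{t_i^n} - W_{t_{i-1}^n}) \mid \mathcal{F}_{t_{i-1}^n}\big)
&= 2\int_{t_{i-1}^n}^{t_i^n} \mathbb{E}_{\theta_0}\big((X_s - X_{t_{i-1}^n})(W_s - W_{t_{i-1}^n})\,a(X_s) \mid \mathcal{F}_{t_{i-1}^n}\big)\,ds \\
&\quad+ \int_{t_{i-1}^n}^{t_i^n} \mathbb{E}_{\theta_0}\big((W_s - W_{t_{i-1}^n})\,b^2(X_s;\theta_0) \mid \mathcal{F}_{t_{i-1}^n}\big)\,ds \\
&\quad+ 2\int_{t_{i-1}^n}^{t_i^n} \mathbb{E}_{\theta_0}\big((X_s - X_{t_{i-1}^n})\,b(X_s;\theta_0) \mid X_{t_{i-1}^n}\big)\,ds.
\end{aligned}$$

Using the Cauchy-Schwarz inequality, Lemma A.4 and Corollary A.5 again,

$$\Big|\int_{t_{i-1}^n}^{t_i^n} \mathbb{E}_{\theta_0}\big((X_s - X_{t_{i-1}^n})(W_s - W_{t_{i-1}^n})\,a(X_s) \mid \mathcal{F}_{t_{i-1}^n}\big)\,ds\Big| \leq C\Delta_n^2\big(1 + |X_{t_{i-1}^n}|^C\big),$$

and by Lemma 2.6,

$$\mathbb{E}_{\theta_0}\big((X_s - X_{t_{i-1}^n})\,b(X_s;\theta_0) \mid X_{t_{i-1}^n}\big) = (s - t_{i-1}^n)\,R(s - t_{i-1}^n, X_{t_{i-1}^n};\theta_0),$$

so also

$$\Big|\int_{t_{i-1}^n}^{t_i^n} \mathbb{E}_{\theta_0}\big((X_s - X_{t_{i-1}^n})\,b(X_s;\theta_0) \mid X_{t_{i-1}^n}\big)\,ds\Big| \leq C\Delta_n^2\big(1 + |X_{t_{i-1}^n}|^C\big).$$

Now

$$\begin{aligned}
&\Big|\frac{1}{\sqrt{\Delta_n}}\sum_{i=1}^{[nt]} \partial_y^2 g(0, X_{t_{i-1}^n}, X_{t_{i-1}^n};\theta_0)\int_{t_{i-1}^n}^{t_i^n} \mathbb{E}_{\theta_0}\big((X_s - X_{t_{i-1}^n})(W_s - W_{t_{i-1}^n})\,a(X_s) \mid \mathcal{F}_{t_{i-1}^n}\big)\,ds\Big| \\
&\quad+ \Big|\frac{1}{\sqrt{\Delta_n}}\sum_{i=1}^{[nt]} \partial_y^2 g(0, X_{t_{i-1}^n}, X_{t_{i-1}^n};\theta_0)\int_{t_{i-1}^n}^{t_i^n} \mathbb{E}_{\theta_0}\big((X_s - X_{t_{i-1}^n})\,b(X_s;\theta_0) \mid X_{t_{i-1}^n}\big)\,ds\Big| \\
&\qquad\leq \Delta_n^{3/2}\, C\sum_{i=1}^{[nt]} \big|\partial_y^2 g(0, X_{t_{i-1}^n}, X_{t_{i-1}^n};\theta_0)\big|\big(1 + |X_{t_{i-1}^n}|^C\big) \overset{\mathcal{P}}{\longrightarrow} 0
\end{aligned} \tag{5.21}$$

by Lemma A.3, so by (5.19) and (5.21), it remains to show that

$$\frac{1}{\sqrt{\Delta_n}}\sum_{i=1}^{[nt]} \partial_y^2 g(0, X_{t_{i-1}^n}, X_{t_{i-1}^n};\theta_0)\int_{t_{i-1}^n}^{t_i^n} \mathbb{E}_{\theta_0}\big((W_s - W_{t_{i-1}^n})\,b^2(X_s;\theta_0) \mid \mathcal{F}_{t_{i-1}^n}\big)\,ds \overset{\mathcal{P}}{\longrightarrow} 0.$$

Applying Itô's formula with the function

$$f(y,w) = (w - w_{t_{i-1}^n})\,b^2(y;\theta_0),$$

and making use of the martingale properties of the stochastic integral terms, yields

$$\begin{aligned}
\int_{t_{i-1}^n}^{t_i^n} \mathbb{E}_{\theta_0}\big((W_s - W_{t_{i-1}^n})\,b^2(X_s;\theta_0) \mid \mathcal{F}_{t_{i-1}^n}\big)\,ds
&= \int_{t_{i-1}^n}^{t_i^n}\!\!\int_{t_{i-1}^n}^{s} \mathbb{E}_{\theta_0}\big(a(X_u)\,\partial_y b^2(X_u;\theta_0)(W_u - W_{t_{i-1}^n}) \mid \mathcal{F}_{t_{i-1}^n}\big)\,du\,ds \\
&\quad+ \frac{1}{2}\int_{t_{i-1}^n}^{t_i^n}\!\!\int_{t_{i-1}^n}^{s} \mathbb{E}_{\theta_0}\big(b^2(X_u;\theta_0)\,\partial_y^2 b^2(X_u;\theta_0)(W_u - W_{t_{i-1}^n}) \mid \mathcal{F}_{t_{i-1}^n}\big)\,du\,ds \\
&\quad+ \int_{t_{i-1}^n}^{t_i^n}\!\!\int_{t_{i-1}^n}^{s} \mathbb{E}_{\theta_0}\big(b(X_u;\theta_0)\,\partial_y b^2(X_u;\theta_0) \mid \mathcal{F}_{t_{i-1}^n}\big)\,du\,ds.
\end{aligned}$$

Again, by repeated use of the Cauchy-Schwarz inequality and Corollary A.5,

$$\Big|\int_{t_{i-1}^n}^{t_i^n} \mathbb{E}_{\theta_0}\big((W_s - W_{t_{i-1}^n})\,b^2(X_s;\theta_0) \mid \mathcal{F}_{t_{i-1}^n}\big)\,ds\Big| \leq C\big(1 + |X_{t_{i-1}^n}|^C\big)\big(\Delta_n^2 + \Delta_n^{5/2}\big).$$

Now

$$\begin{aligned}
&\Big|\frac{1}{\sqrt{\Delta_n}}\sum_{i=1}^{[nt]} \partial_y^2 g(0, X_{t_{i-1}^n}, X_{t_{i-1}^n};\theta_0)\int_{t_{i-1}^n}^{t_i^n} \mathbb{E}_{\theta_0}\big((W_s - W_{t_{i-1}^n})\,b^2(X_s;\theta_0) \mid \mathcal{F}_{t_{i-1}^n}\big)\,ds\Big| \\
&\qquad\leq \big(\Delta_n^{3/2} + \Delta_n^2\big)\sum_{i=1}^{[nt]} \big|\partial_y^2 g(0, X_{t_{i-1}^n}, X_{t_{i-1}^n};\theta_0)\big|\,C\big(1 + |X_{t_{i-1}^n}|^C\big) \overset{\mathcal{P}}{\longrightarrow} 0,
\end{aligned}$$

thus completing the proof. □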
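The Itô expansion (5.20) used in the proof of Lemma 5.2 can be checked term by term. The following worked computation is a sketch, writing $x_0$ and $w_0$ as shorthand (introduced here only for readability) for the conditioning values $x_{t_{i-1}^n}$ and $w_{t_{i-1}^n}$, and using $dX_s = a(X_s)\,ds + b(X_s;\theta_0)\,dW_s$ under $\mathbb{P}_{\theta_0}$.

```latex
\begin{align*}
f(y,w) &= (y - x_0)^2 (w - w_0), \\
\partial_y f &= 2(y - x_0)(w - w_0), \qquad \partial_w f = (y - x_0)^2, \\
\partial_y^2 f &= 2(w - w_0), \qquad \partial_w^2 f = 0, \qquad \partial_y \partial_w f = 2(y - x_0), \\
d\langle X \rangle_s &= b^2(X_s;\theta_0)\,ds, \qquad d\langle X, W \rangle_s = b(X_s;\theta_0)\,ds, \\
df(X_s, W_s) &= \partial_y f\,\bigl(a(X_s)\,ds + b(X_s;\theta_0)\,dW_s\bigr) + \partial_w f\,dW_s \\
&\quad + \tfrac{1}{2}\,\partial_y^2 f\; b^2(X_s;\theta_0)\,ds + \partial_y \partial_w f\; b(X_s;\theta_0)\,ds .
\end{align*}
```

Substituting the partial derivatives gives two $ds$-terms involving $W_s - w_0$, one $ds$-term $2(X_s - x_0)\,b(X_s;\theta_0)$ coming from the cross-variation, and two $dW_s$-terms, which are the five integrals appearing in (5.20).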

Proof of Lemma 5.3. The aim of this proof is to establish that the conditions of Theorem
IX.7.28 in Jacod and Shiryaev (2003) hold, by which the desired result follows directly.

For all $t \in [0,1]$,

$$\sup_{s\leq t}\Big|\frac{1}{\sqrt{\Delta_n}}\sum_{i=1}^{[ns]} \mathbb{E}_{\theta_0}\big(g(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta_0) \mid X_{t_{i-1}^n}\big)\Big| \leq \frac{1}{\sqrt{\Delta_n}}\sum_{i=1}^{[nt]} \Big|\mathbb{E}_{\theta_0}\big(g(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta_0) \mid X_{t_{i-1}^n}\big)\Big|$$

and since the right-hand side converges to 0 in probability under $\mathbb{P}_{\theta_0}$ by (5.1) of Lemma
5.1, so does the left-hand side, i.e. condition 7.27 of Theorem IX.7.28 holds. From (5.2)
and (5.4) of Lemma 5.1, it follows that for all $t \in [0,1]$,

$$\frac{1}{\Delta_n}\sum_{i=1}^{[nt]} \Big(\mathbb{E}_{\theta_0}\big(g^2(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta_0) \mid X_{t_{i-1}^n}\big) - \mathbb{E}_{\theta_0}\big(g(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta_0) \mid X_{t_{i-1}^n}\big)^2\Big) \overset{\mathcal{P}}{\longrightarrow} \frac{1}{2}\int_0^t b^4(X_s;\theta_0)\big(\partial_y^2 g(0,X_s,X_s;\theta_0)\big)^2\,ds,$$

establishing that condition 7.28 of Theorem IX.7.28 is satisfied. Lemma 5.2 implies condition
7.29, while the Lyapunov condition (5.3) of Lemma 5.1 implies the Lindeberg condition
7.30 of Theorem IX.7.28 in Jacod and Shiryaev (2003), from which the desired result
now follows.

Theorem IX.7.28 contains an additional condition 7.31. This condition has the same form
as (5.5), but with $W_{t_i^n} - W_{t_{i-1}^n}$ replaced by $N_{t_i^n} - N_{t_{i-1}^n}$, where $N = (N_t)_{t\geq 0}$ is any bounded
martingale on $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\geq 0}, \mathbb{P}_{\theta_0})$ which is orthogonal to $W$. However, since $(\mathcal{F}_t)_{t\geq 0}$ is
generated by $U$ and $W$, it follows from the martingale representation theorem (Jacod and
Shiryaev, 2003, Theorem III.4.33) that every martingale on $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\geq 0}, \mathbb{P}_{\theta_0})$ may be written
as the sum of a constant term and a stochastic integral with respect to $W$, and therefore
cannot be orthogonal to $W$. □


A Auxiliary Results 

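This section contains a number of technical results used in the proofs in Section 5.2.


Lemma A.1. (Genon-Catalot and Jacod, 1993, Lemma 9) For $i, n \in \mathbb{N}$, let $\mathcal{F}_{n,i} = \mathcal{F}_{t_i^n}$ (with
$\mathcal{F}_{n,0} = \mathcal{F}_0$), and let $F_{n,i}$ be an $\mathcal{F}_{n,i}$-measurable, real-valued random variable. If

$$\sum_{i=1}^{n} \mathbb{E}_{\theta_0}\big(F_{n,i} \mid \mathcal{F}_{n,i-1}\big) \overset{\mathcal{P}}{\longrightarrow} F \quad\text{and}\quad \sum_{i=1}^{n} \mathbb{E}_{\theta_0}\big(F_{n,i}^2 \mid \mathcal{F}_{n,i-1}\big) \overset{\mathcal{P}}{\longrightarrow} 0,$$

for some random variable $F$, then

$$\sum_{i=1}^{n} F_{n,i} \overset{\mathcal{P}}{\longrightarrow} F.$$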
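A familiar special case may help to see what Lemma A.1 asserts. Taking $F_{n,i} = (W_{t_i^n} - W_{t_{i-1}^n})^2$ with $\Delta_n = 1/n$, the conditional means sum to $n\Delta_n = 1$ and the conditional second moments sum to $3n\Delta_n^2 = 3\Delta_n \to 0$, so the lemma gives that the sum of squared Wiener increments over $[0,1]$ converges in probability to 1. The short Python sketch below checks this numerically; the function name and the choice of example are assumptions made for illustration and are not part of the proofs.

```python
import math
import random


def squared_increment_sum(n, seed=0):
    """Sum of n squared Wiener increments over [0, 1] with step 1/n."""
    rng = random.Random(seed)
    sd = math.sqrt(1.0 / n)   # standard deviation of one increment
    return sum(rng.gauss(0.0, sd) ** 2 for _ in range(n))
```

For large `n`, the returned value should be close to 1, with fluctuations of order $\sqrt{2/n}$.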


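Lemma A.2. Suppose that Assumptions 2.4 and 2.5 hold. Then, for all $\theta \in \Theta$,

(i)

$$\begin{aligned}
&\mathbb{E}_{\theta_0}\big(g(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta) \mid X_{t_{i-1}^n}\big) \\
&\quad= \tfrac{1}{2}\Delta_n\big(b^2(X_{t_{i-1}^n};\theta_0) - b^2(X_{t_{i-1}^n};\theta)\big)\,\partial_y^2 g(0, X_{t_{i-1}^n}, X_{t_{i-1}^n};\theta) + \Delta_n^2 R(\Delta_n, X_{t_{i-1}^n};\theta),
\end{aligned} \tag{A.1}$$

(ii)

$$\begin{aligned}
&\mathbb{E}_{\theta_0}\big(\partial_\theta g(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta) \mid X_{t_{i-1}^n}\big) \\
&\quad= \tfrac{1}{2}\Delta_n\big(b^2(X_{t_{i-1}^n};\theta_0) - b^2(X_{t_{i-1}^n};\theta)\big)\,\partial_y^2\partial_\theta g(0, X_{t_{i-1}^n}, X_{t_{i-1}^n};\theta) \\
&\qquad- \tfrac{1}{2}\Delta_n\,\partial_\theta b^2(X_{t_{i-1}^n};\theta)\,\partial_y^2 g(0, X_{t_{i-1}^n}, X_{t_{i-1}^n};\theta) + \Delta_n^2 R(\Delta_n, X_{t_{i-1}^n};\theta),
\end{aligned} \tag{A.2}$$

(iii)

$$\begin{aligned}
&\mathbb{E}_{\theta_0}\big(g^2(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta) \mid X_{t_{i-1}^n}\big) \\
&\quad= \tfrac{1}{2}\Delta_n^2\Big(b^4(X_{t_{i-1}^n};\theta_0) + \tfrac{1}{2}\big(b^2(X_{t_{i-1}^n};\theta_0) - b^2(X_{t_{i-1}^n};\theta)\big)^2\Big)\big(\partial_y^2 g(0, X_{t_{i-1}^n}, X_{t_{i-1}^n};\theta)\big)^2 \\
&\qquad+ \Delta_n^3 R(\Delta_n, X_{t_{i-1}^n};\theta),
\end{aligned} \tag{A.3}$$

(iv)

$$\mathbb{E}_{\theta_0}\big((\partial_\theta g(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta))^2 \mid X_{t_{i-1}^n}\big) = \Delta_n^2 R(\Delta_n, X_{t_{i-1}^n};\theta), \tag{A.4}$$

(v)

$$\mathbb{E}_{\theta_0}\big(g^4(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta) \mid X_{t_{i-1}^n}\big) = \Delta_n^4 R(\Delta_n, X_{t_{i-1}^n};\theta). \tag{A.5}$$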
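The leading term of (A.3) can be made plausible by a short heuristic computation. The sketch below is not part of the proof: it assumes that, to leading order, the increment $\delta = X_{t_i^n} - X_{t_{i-1}^n}$ behaves like a centred Gaussian variable with variance $\Delta_n b_0^2$, and that $g(\Delta_n, y, x;\theta) \approx \tfrac{1}{2}\,\partial_y^2 g(0,x,x;\theta)\,\bigl((y-x)^2 - \Delta_n b_\theta^2\bigr)$, which is consistent with (A.1). Here $b_0^2 = b^2(x;\theta_0)$ and $b_\theta^2 = b^2(x;\theta)$ are shorthand introduced only for this illustration.

```latex
\begin{align*}
\mathbb{E}\bigl(\delta^2\bigr) &= \Delta_n b_0^2, \qquad \mathbb{E}\bigl(\delta^4\bigr) = 3\Delta_n^2 b_0^4, \\
\mathbb{E}\Bigl(\bigl(\delta^2 - \Delta_n b_\theta^2\bigr)^2\Bigr)
  &= 3\Delta_n^2 b_0^4 - 2\Delta_n^2 b_0^2 b_\theta^2 + \Delta_n^2 b_\theta^4
   = \Delta_n^2\Bigl(2 b_0^4 + \bigl(b_0^2 - b_\theta^2\bigr)^2\Bigr), \\
\tfrac{1}{4}\bigl(\partial_y^2 g\bigr)^2\,\mathbb{E}\Bigl(\bigl(\delta^2 - \Delta_n b_\theta^2\bigr)^2\Bigr)
  &= \tfrac{1}{2}\Delta_n^2\Bigl(b_0^4 + \tfrac{1}{2}\bigl(b_0^2 - b_\theta^2\bigr)^2\Bigr)\bigl(\partial_y^2 g\bigr)^2 .
\end{align*}
```

The last line agrees with the leading term in (A.3); the contributions neglected in this heuristic are of the order collected in the remainder term $\Delta_n^3 R(\Delta_n, X_{t_{i-1}^n};\theta)$.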


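Proof of Lemma A.2. The formulae (A.1), (A.2) and (A.3) are implicitly given in the
proofs of Sørensen (2010, Lemmas 3.2 & 3.4). To prove the two remaining formulae,
note first that using (2.5), Assumption 2.5(i) and Lemma 2.7,

$$\begin{aligned}
\mathcal{L}_{\theta_0}^i\big(g^4(0;\theta)\big)(x,x) &= 0, \quad i = 1,2,3 \\
\mathcal{L}_{\theta_0}^i\big(g^3(0;\theta)\,g^{(1)}(\theta)\big)(x,x) &= 0, \quad i = 1,2 \\
\mathcal{L}_{\theta_0}\big(g^2(0;\theta)\,g^{(1)}(\theta)^2\big)(x,x) &= 0 \\
\mathcal{L}_{\theta_0}\big(g^3(0;\theta)\,g^{(2)}(\theta)\big)(x,x) &= 0 \\
\mathcal{L}_{\theta_0}\big(\partial_\theta g(0;\theta)^2\big)(x,x) &= 0.
\end{aligned}$$

The verification of these formulae may be simplified by using e.g. the Leibniz formula for
the $n$'th derivative of a product to see that partial derivatives are zero when evaluated in
$y = x$. These results, as well as Lemmas 2.6 and 2.7 and (A.8), are used without reference
in the following.

$$\begin{aligned}
&\mathbb{E}_{\theta_0}\big((\partial_\theta g(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta))^2 \mid X_{t_{i-1}^n}\big) \\
&\quad= \mathbb{E}_{\theta_0}\big(\partial_\theta g(0, X_{t_i^n}, X_{t_{i-1}^n};\theta)^2 \mid X_{t_{i-1}^n}\big) \\
&\qquad+ 2\Delta_n\,\mathbb{E}_{\theta_0}\big(\partial_\theta g(0, X_{t_i^n}, X_{t_{i-1}^n};\theta)\,\partial_\theta g^{(1)}(X_{t_i^n}, X_{t_{i-1}^n};\theta) \mid X_{t_{i-1}^n}\big) \\
&\qquad+ \Delta_n^2\,\mathbb{E}_{\theta_0}\big(R(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta) \mid X_{t_{i-1}^n}\big) \\
&\quad= \partial_\theta g(0, X_{t_{i-1}^n}, X_{t_{i-1}^n};\theta)^2 + \Delta_n\,\mathcal{L}_{\theta_0}\big(\partial_\theta g(0,\theta)^2\big)(X_{t_{i-1}^n}, X_{t_{i-1}^n}) + \Delta_n^2 R(\Delta_n, X_{t_{i-1}^n};\theta) \\
&\qquad+ 2\Delta_n\Big(\partial_\theta g(0, X_{t_{i-1}^n}, X_{t_{i-1}^n};\theta)\,\partial_\theta g^{(1)}(X_{t_{i-1}^n}, X_{t_{i-1}^n};\theta) + \Delta_n R(\Delta_n, X_{t_{i-1}^n};\theta)\Big) \\
&\quad= \Delta_n^2 R(\Delta_n, X_{t_{i-1}^n};\theta),
\end{aligned}$$

proving (A.4). Similarly, writing $X_i = X_{t_i^n}$ and $X_{i-1} = X_{t_{i-1}^n}$ within this display,

$$\begin{aligned}
&\mathbb{E}_{\theta_0}\big(g^4(\Delta_n, X_i, X_{i-1};\theta) \mid X_{i-1}\big) \\
&\quad= \mathbb{E}_{\theta_0}\big(g^4(0, X_i, X_{i-1};\theta) \mid X_{i-1}\big) \\
&\qquad+ 4\Delta_n\,\mathbb{E}_{\theta_0}\big(g^3(0, X_i, X_{i-1};\theta)\,g^{(1)}(X_i, X_{i-1};\theta) \mid X_{i-1}\big) \\
&\qquad+ 6\Delta_n^2\,\mathbb{E}_{\theta_0}\big(g^2(0, X_i, X_{i-1};\theta)\,g^{(1)}(X_i, X_{i-1};\theta)^2 \mid X_{i-1}\big) \\
&\qquad+ 2\Delta_n^2\,\mathbb{E}_{\theta_0}\big(g^3(0, X_i, X_{i-1};\theta)\,g^{(2)}(X_i, X_{i-1};\theta) \mid X_{i-1}\big) \\
&\qquad+ 4\Delta_n^3\,\mathbb{E}_{\theta_0}\big(g(0, X_i, X_{i-1};\theta)\,g^{(1)}(X_i, X_{i-1};\theta)^3 \mid X_{i-1}\big) \\
&\qquad+ 6\Delta_n^3\,\mathbb{E}_{\theta_0}\big(g^2(0, X_i, X_{i-1};\theta)\,g^{(1)}(X_i, X_{i-1};\theta)\,g^{(2)}(X_i, X_{i-1};\theta) \mid X_{i-1}\big) \\
&\qquad+ \tfrac{2}{3}\Delta_n^3\,\mathbb{E}_{\theta_0}\big(g^3(0, X_i, X_{i-1};\theta)\,g^{(3)}(X_i, X_{i-1};\theta) \mid X_{i-1}\big) \\
&\qquad+ \Delta_n^4\,\mathbb{E}_{\theta_0}\big(R(\Delta_n, X_i, X_{i-1};\theta) \mid X_{i-1}\big) \\
&\quad= g^4(0, X_{i-1}, X_{i-1};\theta) + \Delta_n\,\mathcal{L}_{\theta_0}\big(g^4(0;\theta)\big)(X_{i-1}, X_{i-1}) + \tfrac{1}{2}\Delta_n^2\,\mathcal{L}_{\theta_0}^2\big(g^4(0;\theta)\big)(X_{i-1}, X_{i-1}) \\
&\qquad+ \tfrac{1}{6}\Delta_n^3\,\mathcal{L}_{\theta_0}^3\big(g^4(0;\theta)\big)(X_{i-1}, X_{i-1}) + 4\Delta_n\,g^3(0, X_{i-1}, X_{i-1};\theta)\,g^{(1)}(X_{i-1}, X_{i-1};\theta) \\
&\qquad+ 4\Delta_n^2\,\mathcal{L}_{\theta_0}\big(g^3(0;\theta)\,g^{(1)}(\theta)\big)(X_{i-1}, X_{i-1}) + 2\Delta_n^3\,\mathcal{L}_{\theta_0}^2\big(g^3(0;\theta)\,g^{(1)}(\theta)\big)(X_{i-1}, X_{i-1}) \\
&\qquad+ 6\Delta_n^2\,g^2(0, X_{i-1}, X_{i-1};\theta)\,g^{(1)}(X_{i-1}, X_{i-1};\theta)^2 + 6\Delta_n^3\,\mathcal{L}_{\theta_0}\big(g^2(0;\theta)\,g^{(1)}(\theta)^2\big)(X_{i-1}, X_{i-1}) \\
&\qquad+ 2\Delta_n^2\,g^3(0, X_{i-1}, X_{i-1};\theta)\,g^{(2)}(X_{i-1}, X_{i-1};\theta) + 2\Delta_n^3\,\mathcal{L}_{\theta_0}\big(g^3(0;\theta)\,g^{(2)}(\theta)\big)(X_{i-1}, X_{i-1}) \\
&\qquad+ 4\Delta_n^3\,g(0, X_{i-1}, X_{i-1};\theta)\,g^{(1)}(X_{i-1}, X_{i-1};\theta)^3 \\
&\qquad+ 6\Delta_n^3\,g^2(0, X_{i-1}, X_{i-1};\theta)\,g^{(1)}(X_{i-1}, X_{i-1};\theta)\,g^{(2)}(X_{i-1}, X_{i-1};\theta) \\
&\qquad+ \tfrac{2}{3}\Delta_n^3\,g^3(0, X_{i-1}, X_{i-1};\theta)\,g^{(3)}(X_{i-1}, X_{i-1};\theta) \\
&\qquad+ \Delta_n^4 R(\Delta_n, X_{i-1};\theta) \\
&\quad= \Delta_n^4 R(\Delta_n, X_{i-1};\theta),
\end{aligned}$$

which proves (A.5). □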

Lemma A.3. Let $x \mapsto f(x)$ be a continuous, real-valued function, and let $t \in [0,1]$ be
given. Then

$$\Delta_n\sum_{i=1}^{[nt]} f(X_{t_{i-1}^n}) \overset{\mathcal{P}}{\longrightarrow} \int_0^t f(X_s)\,ds.$$

Lemma A.3 follows easily by the convergence of Riemann sums.


Lemma A.4. Suppose that Assumption 2.4 holds, and let $m \geq 2$. Then, there exists a
constant $C_m > 0$, such that for $0 \leq t \leq t + \Delta \leq 1$,

$$\mathbb{E}_{\theta_0}\big(|X_{t+\Delta} - X_t|^m \mid X_t\big) \leq C_m \Delta^{m/2}\big(1 + |X_t|^m\big). \tag{A.6}$$


Corollary A.5. Suppose that Assumption 2.4 holds. Let a compact, convex set $K \subseteq \Theta$ be
given, and suppose that $f(y,x;\theta)$ is of polynomial growth in $x$ and $y$, uniformly for $\theta$ in $K$.
Then, there exist constants $C_K > 0$ such that for $0 \leq t \leq t + \Delta \leq 1$,

$$\mathbb{E}_{\theta_0}\big(|f(X_{t+\Delta}, X_t;\theta)| \mid X_t\big) \leq C_K\big(1 + |X_t|^{C_K}\big)$$

for all $\theta \in K$. ∘

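Lemma A.4 and Corollary A.5 correspond to Lemma 6 of Kessler (1997), adapted to the
present assumptions. For use in the following, observe that for any $\theta \in \Theta$, there exist
constants $C_\theta > 0$ such that

$$\Delta_n\sum_{i=1}^{[nt]} \big|R_\theta(\Delta_n, X_{t_{i-1}^n})\big| \leq C_\theta\,\Delta_n\sum_{i=1}^{[nt]} \big(1 + |X_{t_{i-1}^n}|^{C_\theta}\big),$$

so it follows from Lemma A.3 that for any deterministic, real-valued sequence $(\delta_n)_{n\in\mathbb{N}}$ with
$\delta_n \to 0$ as $n \to \infty$,

$$\delta_n\,\Delta_n\sum_{i=1}^{[nt]} \big|R_\theta(\Delta_n, X_{t_{i-1}^n})\big| \overset{\mathcal{P}}{\longrightarrow} 0. \tag{A.7}$$

Note that by Corollary A.5, it holds that under Assumption 2.4,

$$\mathbb{E}_{\theta_0}\big(R(\Delta, X_{t+\Delta}, X_t;\theta) \mid X_t\big) = R(\Delta, X_t;\theta). \tag{A.8}$$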

Lemma A.6. Suppose that Assumption 2.4 holds, and that the function $f(t,y,x;\theta)$ satisfies
that

$$f(t,y,x;\theta) \in C^{pol}_{1,2,1}([0,1]\times\mathcal{X}^2\times\Theta) \quad\text{with}\quad f(0,x,x;\theta) = 0 \tag{A.9}$$

for all $x \in \mathcal{X}$ and $\theta \in \Theta$. Let $m \in \mathbb{N}$ be given, and let $Dk(\,\cdot\,;\theta,\theta') = k(\,\cdot\,;\theta) - k(\,\cdot\,;\theta')$. Then,
there exist constants $C_m > 0$ such that

$$\begin{aligned}
\mathbb{E}_{\theta_0}\big(|Df(t-s, X_t, X_s;\theta,\theta')|^{2m}\big)
&\leq C_m (t-s)^{2m-1}\int_s^t \mathbb{E}_{\theta_0}\big(|Df_1(u-s, X_u, X_s;\theta,\theta')|^{2m}\big)\,du \\
&\quad+ C_m (t-s)^{m-1}\int_s^t \mathbb{E}_{\theta_0}\big(|Df_2(u-s, X_u, X_s;\theta,\theta')|^{2m}\big)\,du
\end{aligned} \tag{A.10}$$

for $0 \leq s < t \leq 1$ and $\theta,\theta' \in \Theta$, where $f_1$ and $f_2$ are given by

$$\begin{aligned}
f_1(t,y,x;\theta) &= \partial_t f(t,y,x;\theta) + a(y)\,\partial_y f(t,y,x;\theta) + \tfrac{1}{2}b^2(y;\theta_0)\,\partial_y^2 f(t,y,x;\theta) \\
f_2(t,y,x;\theta) &= b(y;\theta_0)\,\partial_y f(t,y,x;\theta).
\end{aligned}$$

Furthermore, for each compact, convex set $K \subseteq \Theta$, there exists a constant $C_{m,K} > 0$ such
that

$$\mathbb{E}_{\theta_0}\big(|Df_j(t-s, X_t, X_s;\theta,\theta')|^{2m}\big) \leq C_{m,K}|\theta - \theta'|^{2m}$$

for $j = 1,2$, $0 \leq s < t \leq 1$ and all $\theta,\theta' \in K$. ∘

Proof of Lemma A.6. A simple application of Itô's formula (when conditioning on $X_s = x_s$)
yields that for all $\theta \in \Theta$,

$$f(t-s, X_t, X_s;\theta) = \int_s^t f_1(u-s, X_u, X_s;\theta)\,du + \int_s^t f_2(u-s, X_u, X_s;\theta)\,dW_u \tag{A.11}$$

under $\mathbb{P}_{\theta_0}$.

By Jensen's inequality, it holds that for any $k \in \mathbb{N}$,

$$\mathbb{E}_{\theta_0}\Big(\Big|\int_s^t Df_j(u-s, X_u, X_s;\theta,\theta')^{j}\,du\Big|^{k}\Big) \leq (t-s)^{k-1}\int_s^t \mathbb{E}_{\theta_0}\big(|Df_j(u-s, X_u, X_s;\theta,\theta')|^{jk}\big)\,du \tag{A.12}$$

for $j = 1,2$, and by the martingale properties of the second term in (A.11), the Burkholder-Davis-Gundy
inequality may be used to show that

$$\mathbb{E}_{\theta_0}\Big(\Big|\int_s^t Df_2(u-s, X_u, X_s;\theta,\theta')\,dW_u\Big|^{2m}\Big) \leq C_m\,\mathbb{E}_{\theta_0}\Big(\Big|\int_s^t Df_2(u-s, X_u, X_s;\theta,\theta')^2\,du\Big|^{m}\Big). \tag{A.13}$$

Now, (A.11), (A.12) and (A.13) may be combined to show (A.10). The last result of the
lemma follows by an application of the mean value theorem. □


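Lemma A.7. Suppose that Assumption 2.4 holds, and let $K \subseteq \Theta$ be compact and convex.
Assume that $f(t,y,x;\theta)$ satisfies (A.9) for all $x \in \mathcal{X}$ and $\theta \in \Theta$, and define

$$F_n(\theta) = \sum_{i=1}^{n} f(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta).$$

Then, for each $m \in \mathbb{N}$, there exists a constant $C_{m,K} > 0$, such that

$$\mathbb{E}_{\theta_0}\big|F_n(\theta) - F_n(\theta')\big|^{2m} \leq C_{m,K}|\theta - \theta'|^{2m}$$

for all $\theta,\theta' \in K$ and $n \in \mathbb{N}$. Define $\widetilde F_n(\theta) = \Delta_n^{-1}F_n(\theta)$, and suppose, moreover, that the
functions

$$\begin{aligned}
h_1(t,y,x;\theta) &= \partial_t f(t,y,x;\theta) + a(y)\,\partial_y f(t,y,x;\theta) + \tfrac{1}{2}b^2(y;\theta_0)\,\partial_y^2 f(t,y,x;\theta) \\
h_2(t,y,x;\theta) &= b(y;\theta_0)\,\partial_y f(t,y,x;\theta) \\
h_{j2}(t,y,x;\theta) &= b(y;\theta_0)\,\partial_y h_j(t,y,x;\theta)
\end{aligned}$$

satisfy (A.9) for $j = 1,2$. Then, for each $m \in \mathbb{N}$, there exists a constant $C_{m,K} > 0$, such that

$$\mathbb{E}_{\theta_0}\big|\widetilde F_n(\theta) - \widetilde F_n(\theta')\big|^{2m} \leq C_{m,K}|\theta - \theta'|^{2m}$$

for all $\theta,\theta' \in K$ and $n \in \mathbb{N}$. ∘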



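Proof of Lemma A.7. For use in the following, define, in addition to $h_1$, $h_2$ and $h_{j2}$, the
functions

$$\begin{aligned}
h_{j1}(t,y,x;\theta) &= \partial_t h_j(t,y,x;\theta) + a(y)\,\partial_y h_j(t,y,x;\theta) + \tfrac{1}{2}b^2(y;\theta_0)\,\partial_y^2 h_j(t,y,x;\theta) \\
h_{j21}(t,y,x;\theta) &= \partial_t h_{j2}(t,y,x;\theta) + a(y)\,\partial_y h_{j2}(t,y,x;\theta) + \tfrac{1}{2}b^2(y;\theta_0)\,\partial_y^2 h_{j2}(t,y,x;\theta) \\
h_{j22}(t,y,x;\theta) &= b(y;\theta_0)\,\partial_y h_{j2}(t,y,x;\theta)
\end{aligned}$$

for $j = 1,2$, and, for ease of notation, let

$$H_j^{n,i}(u;\theta,\theta') = Dh_j(u - t_{i-1}^n, X_u, X_{t_{i-1}^n};\theta,\theta')$$

for $j \in \{1,2,11,12,21,22,121,122,221,222\}$, where $Dk(\,\cdot\,;\theta,\theta') = k(\,\cdot\,;\theta) - k(\,\cdot\,;\theta')$. Recall
that $\Delta_n = 1/n$.

First, by the martingale properties of

$$\Delta_n\sum_{i=1}^{n}\int_0^r \mathbf{1}_{(t_{i-1}^n, t_i^n]}(u)\,H_2^{n,i}(u;\theta,\theta')\,dW_u,$$

the Burkholder-Davis-Gundy inequality is used to establish the existence of a constant
$C_m > 0$ such that

$$\mathbb{E}_{\theta_0}\Big(\Big|\Delta_n\sum_{i=1}^{n}\int_{t_{i-1}^n}^{t_i^n} H_2^{n,i}(u;\theta,\theta')\,dW_u\Big|^{2m}\Big) \leq C_m\,\mathbb{E}_{\theta_0}\Big(\Big|\Delta_n^2\sum_{i=1}^{n}\int_{t_{i-1}^n}^{t_i^n} H_2^{n,i}(u;\theta,\theta')^2\,du\Big|^{m}\Big).$$

Now, using also Itô's formula, Jensen's inequality and Lemma A.6,

$$\begin{aligned}
&\mathbb{E}_{\theta_0}\Big(\Big|\Delta_n\sum_{i=1}^{n} Df(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta,\theta')\Big|^{2m}\Big) \\
&\quad\leq C_m\,\mathbb{E}_{\theta_0}\Big(\Big|\Delta_n\sum_{i=1}^{n}\int_{t_{i-1}^n}^{t_i^n} H_1^{n,i}(u;\theta,\theta')\,du\Big|^{2m}\Big) + C_m\,\mathbb{E}_{\theta_0}\Big(\Big|\Delta_n\sum_{i=1}^{n}\int_{t_{i-1}^n}^{t_i^n} H_2^{n,i}(u;\theta,\theta')\,dW_u\Big|^{2m}\Big) \\
&\quad\leq C_m\,\mathbb{E}_{\theta_0}\Big(\Big|\Delta_n\sum_{i=1}^{n}\int_{t_{i-1}^n}^{t_i^n} H_1^{n,i}(u;\theta,\theta')\,du\Big|^{2m}\Big) + C_m\,\mathbb{E}_{\theta_0}\Big(\Big|\Delta_n^2\sum_{i=1}^{n}\int_{t_{i-1}^n}^{t_i^n} H_2^{n,i}(u;\theta,\theta')^2\,du\Big|^{m}\Big) \\
&\quad\leq C_m\,\Delta_n^{2m+1}\sum_{i=1}^{n}\bigg(\mathbb{E}_{\theta_0}\Big(\Big|\frac{1}{\Delta_n}\int_{t_{i-1}^n}^{t_i^n} H_1^{n,i}(u;\theta,\theta')\,du\Big|^{2m}\Big) + \mathbb{E}_{\theta_0}\Big(\Big|\frac{1}{\Delta_n}\int_{t_{i-1}^n}^{t_i^n} H_2^{n,i}(u;\theta,\theta')^2\,du\Big|^{m}\Big)\bigg) \\
&\quad\leq C_m\,\Delta_n^{2m}\sum_{i=1}^{n}\bigg(\int_{t_{i-1}^n}^{t_i^n} \mathbb{E}_{\theta_0}\big(|H_1^{n,i}(u;\theta,\theta')|^{2m}\big)\,du + \int_{t_{i-1}^n}^{t_i^n} \mathbb{E}_{\theta_0}\big(|H_2^{n,i}(u;\theta,\theta')|^{2m}\big)\,du\bigg) \\
&\quad\leq C_{m,K}|\theta - \theta'|^{2m}\,\Delta_n^{2m},
\end{aligned} \tag{A.14}$$

where the label (A.14) refers to the second-to-last bound. Thus

$$\mathbb{E}_{\theta_0}\big(|DF_n(\theta,\theta')|^{2m}\big) = \Delta_n^{-2m}\,\mathbb{E}_{\theta_0}\Big(\Big|\Delta_n\sum_{i=1}^{n} Df(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta,\theta')\Big|^{2m}\Big) \leq C_{m,K}|\theta - \theta'|^{2m}$$

for all $\theta,\theta' \in K$ and $n \in \mathbb{N}$. In the case where also $h_j$ and $h_{j2}$ satisfy (A.9) for all $x \in \mathcal{X}$,
$\theta \in \Theta$ and $j = 1,2$, use Lemma A.6 to write

$$\begin{aligned}
&\mathbb{E}_{\theta_0}\big(|H_1^{n,i}(u;\theta,\theta')|^{2m}\big) \\
&\quad\leq C_m (u - t_{i-1}^n)^{2m-1}\int_{t_{i-1}^n}^{u} \mathbb{E}_{\theta_0}\big(|H_{11}^{n,i}(v;\theta,\theta')|^{2m}\big)\,dv + C_m (u - t_{i-1}^n)^{m-1}\int_{t_{i-1}^n}^{u} \mathbb{E}_{\theta_0}\big(|H_{12}^{n,i}(v;\theta,\theta')|^{2m}\big)\,dv \\
&\quad\leq C_m (u - t_{i-1}^n)^{2m-1}\int_{t_{i-1}^n}^{u} \mathbb{E}_{\theta_0}\big(|H_{11}^{n,i}(v;\theta,\theta')|^{2m}\big)\,dv \\
&\qquad+ C_m (u - t_{i-1}^n)^{m-1}\int_{t_{i-1}^n}^{u} (v - t_{i-1}^n)^{2m-1}\int_{t_{i-1}^n}^{v} \mathbb{E}_{\theta_0}\big(|H_{121}^{n,i}(w;\theta,\theta')|^{2m}\big)\,dw\,dv \\
&\qquad+ C_m (u - t_{i-1}^n)^{m-1}\int_{t_{i-1}^n}^{u} (v - t_{i-1}^n)^{m-1}\int_{t_{i-1}^n}^{v} \mathbb{E}_{\theta_0}\big(|H_{122}^{n,i}(w;\theta,\theta')|^{2m}\big)\,dw\,dv \\
&\quad\leq C_{m,K}|\theta - \theta'|^{2m}\big((u - t_{i-1}^n)^{2m} + (u - t_{i-1}^n)^{3m}\big),
\end{aligned}$$

and similarly obtain

$$\mathbb{E}_{\theta_0}\big(|H_2^{n,i}(u;\theta,\theta')|^{2m}\big) \leq C_{m,K}|\theta - \theta'|^{2m}\big((u - t_{i-1}^n)^{2m} + (u - t_{i-1}^n)^{3m}\big).$$

Now, inserting into (A.14),

$$\begin{aligned}
&\mathbb{E}_{\theta_0}\Big(\Big|\Delta_n\sum_{i=1}^{n} Df(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta,\theta')\Big|^{2m}\Big) \\
&\quad\leq C_m\,\Delta_n^{2m}\sum_{i=1}^{n}\bigg(\int_{t_{i-1}^n}^{t_i^n} \mathbb{E}_{\theta_0}\big(|H_1^{n,i}(u;\theta,\theta')|^{2m}\big)\,du + \int_{t_{i-1}^n}^{t_i^n} \mathbb{E}_{\theta_0}\big(|H_2^{n,i}(u;\theta,\theta')|^{2m}\big)\,du\bigg) \\
&\quad\leq C_{m,K}|\theta - \theta'|^{2m}\,\Delta_n^{2m}\sum_{i=1}^{n}\int_{t_{i-1}^n}^{t_i^n} \big((u - t_{i-1}^n)^{2m} + (u - t_{i-1}^n)^{3m}\big)\,du \\
&\quad\leq C_{m,K}|\theta - \theta'|^{2m}\big(\Delta_n^{4m} + \Delta_n^{5m}\big)
\end{aligned}$$

and, ultimately,

$$\begin{aligned}
\mathbb{E}_{\theta_0}\big(|D\widetilde F_n(\theta,\theta')|^{2m}\big)
&= \mathbb{E}_{\theta_0}\Big(\Big|\Delta_n^{-1}\sum_{i=1}^{n} Df(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta,\theta')\Big|^{2m}\Big) \\
&= \Delta_n^{-4m}\,\mathbb{E}_{\theta_0}\Big(\Big|\Delta_n\sum_{i=1}^{n} Df(\Delta_n, X_{t_i^n}, X_{t_{i-1}^n};\theta,\theta')\Big|^{2m}\Big) \\
&\leq C_{m,K}|\theta - \theta'|^{2m}\big(1 + \Delta_n^{m}\big) \\
&\leq C_{m,K}|\theta - \theta'|^{2m}.
\end{aligned}$$

□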

Lemma A.8. Suppose that Assumption 2.4 is satisfied. Let $f \in C^{pol}_{0,1}(\mathcal{X}\times\Theta)$. Define

$$F(\theta) = \int_0^1 f(X_s;\theta)\,ds$$

and let $K \subseteq \Theta$ be compact and convex. Then, for each $m \in \mathbb{N}$, there exists a constant
$C_{m,K} > 0$ such that for all $\theta,\theta' \in K$,

$$\mathbb{E}_{\theta_0}\big|F(\theta) - F(\theta')\big|^{2m} \leq C_{m,K}|\theta - \theta'|^{2m}.$$

∘

Lemma A.8 follows from a simple application of the mean value theorem.


Lemma A.9. Let $K \subseteq \Theta$ be compact. Suppose that $H_n = (H_n(\theta))_{\theta\in K}$ defines a sequence
$(H_n)_{n\in\mathbb{N}}$ of continuous, real-valued stochastic processes such that

$$H_n(\theta) \overset{\mathcal{P}}{\longrightarrow} 0$$

point-wise for all $\theta \in K$. Furthermore, assume that for some $m \in \mathbb{N}$, there exists a constant
$C_{m,K} > 0$ such that for all $\theta,\theta' \in K$ and $n \in \mathbb{N}$,

$$\mathbb{E}_{\theta_0}\big|H_n(\theta) - H_n(\theta')\big|^{2m} \leq C_{m,K}|\theta - \theta'|^{2m}. \tag{A.15}$$

Then,

$$\sup_{\theta\in K} |H_n(\theta)| \overset{\mathcal{P}}{\longrightarrow} 0.$$

∘

Proof of Lemma A.9. $(H_n(\theta))_{n\in\mathbb{N}}$ is tight in $\mathbb{R}$ for all $\theta \in K$, so, using (A.15), it follows
from Kallenberg (2002, Corollary 16.9 & Theorem 16.3) that the sequence of processes
$(H_n)_{n\in\mathbb{N}}$ is tight in $C(K,\mathbb{R})$, the space of continuous (and bounded) real-valued functions on
$K$, and thus relatively compact in distribution. Also, for all $d \in \mathbb{N}$ and $(\theta_1, \ldots, \theta_d) \in K^d$,

$$\begin{pmatrix} H_n(\theta_1) \\ \vdots \\ H_n(\theta_d) \end{pmatrix} \overset{\mathcal{D}}{\longrightarrow} \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix},$$

so by Kallenberg (2002, Lemma 16.2), $H_n \overset{\mathcal{D}}{\longrightarrow} 0$ in $C(K,\mathbb{R})$ equipped with the uniform
metric. Finally, by the continuous mapping theorem, $\sup_{\theta\in K}|H_n(\theta)| \overset{\mathcal{D}}{\longrightarrow} 0$, and the desired
result follows. □